Discussion:
assume, assert, enforce, @safe
Walter Bright via Digitalmars-d
2014-07-30 22:01:25 UTC
I'd like to sum up my position and intent on all this.

1. I can discern no useful, practical difference between the notions of assume
and assert.

2. The compiler can make use of assert expressions to improve optimization, even
in -release mode.

3. Use of assert to validate input is utterly wrong and will not be supported.
Use such constructs at your own risk.

4. An assert failure is a non-recoverable error. The compiler may assume that
execution does not proceed after one is tripped. The language does allow
attempts to shut a program down gracefully after one is tripped, but that must
not be misconstrued as assuming that the program is in a valid state at that point.

5. assert(0); is equivalent to a halt, and the compiler won't remove it.

6. enforce() is meant to check for input errors (environmental errors are
considered input).

7. using enforce() to check for program bugs is utterly wrong. enforce() is a
library creation, the core language does not recognize it.

8. @safe is a guarantee of memory safety. It is not a guarantee that a program
passes all its assert expressions. -release does not disable @safe.

9. -noboundscheck does disable @safe's array bounds checks; however, the
compiler may assume that the array index is within bounds after use, even
without the array bounds check.


I am not terribly good at writing formal legalese specifications for this. I
welcome PRs to improve the specification along these lines, if you find any
Aha! Gotcha! issues in it. Of course, implementation errors for this in DMD
should be reported on bugzilla.
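
A minimal sketch of the split described in points 3, 6 and 7, assuming two hypothetical helpers (parsePercent and scale are illustrative names, not Phobos functions). It is one possible reading of the rules, not code from the post: enforce() guards data that arrives from outside the program, and assert() documents what the program itself must already have guaranteed.

```d
import std.conv : to;
import std.exception : enforce;

// Hypothetical helper: the text comes from outside the program (user, file, network).
int parsePercent(string text)
{
    int value = text.to!int; // throws ConvException on malformed text
    // Input error: report it with an exception that the caller can catch.
    enforce(value >= 0 && value <= 100, "percentage out of range: " ~ text);
    return value;
}

// Hypothetical helper: callers inside the program must pass a validated value.
int scale(int percent, int total)
{
    // A failure here is a program bug, not bad input.
    // The check is not executed when compiling with -release.
    assert(percent >= 0 && percent <= 100, "caller passed an unvalidated percentage");
    return total * percent / 100;
}
```

Under this reading, parsePercent is the only place that sees untrusted data, and everything downstream of it relies on assert().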
Andrei Alexandrescu via Digitalmars-d
2014-07-30 22:12:55 UTC
Post by Walter Bright via Digitalmars-d
7. using enforce() to check for program bugs is utterly wrong. enforce()
is a library creation, the core language does not recognize it.
GAAAAAA!
Joseph Rushton Wakeling via Digitalmars-d
2014-07-30 22:39:52 UTC
Post by Walter Bright via Digitalmars-d
3. Use of assert to validate input is utterly wrong and will not be supported.
Use such constructs at your own risk.
...
6. enforce() is meant to check for input errors (environmental errors are
considered input).
7. using enforce() to check for program bugs is utterly wrong. enforce() is a
library creation, the core language does not recognize it.
A question on that.

There are various places in Phobos where enforce() statements are used to
validate function input or class constructor parameters. For example, in
std.random.LinearCongruentialEngine, we have:

void seed(UIntType x0 = 1) @safe pure
{
    static if (c == 0)
    {
        enforce(x0, "Invalid (zero) seed for "
                ~ LinearCongruentialEngine.stringof);
    }
    _x = modulus ? (x0 % modulus) : x0;
    popFront();
}

while in RandomSample we have this method which is called inside the constructor:

private void initialize(size_t howMany, size_t total)
{
    _available = total;
    _toSelect = howMany;
    enforce(_toSelect <= _available,
            text("RandomSample: cannot sample ", _toSelect,
                 " items when only ", _available, " are available"));
    static if (hasLength!Range)
    {
        enforce(_available <= _input.length,
                text("RandomSample: specified ", _available,
                     " items as available when input contains only ",
                     _input.length));
    }
}

The point is that using these enforce() statements means that these methods
cannot be nothrow, which doesn't seem particularly nice if it can be avoided.
Now, on the one hand, one could say that, quite obviously, these methods cannot
control their input. But on the other hand, it's reasonable to say that these
methods' input can and should never be anything other than 100% controlled by
the programmer.

My take is that, for this reason, these should be asserts and not enforce()
statements. What are your thoughts on the matter?
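
As a rough illustration of the trade-off raised here, the following sketch uses a hypothetical, simplified stand-in for the initialize() method (the Sampler struct and its remaining() helper are invented for this example and are not the Phobos code). With an assert-based `in` contract the method can carry nothrow, which the enforce() version cannot. It uses the newer `do` keyword; compilers contemporary with this thread spell it `body`.

```d
// Hypothetical, simplified stand-in for RandomSample's initialize().
struct Sampler
{
    private size_t _available;
    private size_t _toSelect;

    // The precondition is an assert inside an `in` contract, so the
    // method can be nothrow: a failed assert raises an Error, not an Exception.
    void initialize(size_t howMany, size_t total) pure nothrow @safe @nogc
    in
    {
        assert(howMany <= total, "cannot sample more items than are available");
    }
    do
    {
        _available = total;
        _toSelect = howMany;
    }

    // Illustrative accessor so the stored state can be observed.
    size_t remaining() const pure nothrow @safe @nogc
    {
        return _available - _toSelect;
    }
}
```

The cost is the one discussed in the replies: the contract is not checked in a -release build, so a caller passing bad arguments is no longer caught.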
Andrei Alexandrescu via Digitalmars-d
2014-07-30 22:57:42 UTC
Post by Joseph Rushton Wakeling via Digitalmars-d
Post by Walter Bright via Digitalmars-d
7. using enforce() to check for program bugs is utterly wrong. enforce() is a
library creation, the core language does not recognize it.
A question on that.
There are various places in Phobos where enforce() statements are used
to validate function input or class constructor parameters.
Yah, Phobos is a bit inconsistent about that. TDPL discusses the matter:
if a library is deployed in separation from the program(s) it serves, it
may as well handle arguments as "input". That's what e.g. the Windows
API is doing - it consistently considers all function arguments
"inputs", scrubs them, and returns error codes for all invalid inputs it
detects. In contrast, the traditional libc/Unix interface does little
checking, even a strlen(NULL) will segfault.

Phobos is somewhere in the middle - sometimes it verifies arguments with
enforce(), some other times it just uses assert().


Andrei
Jonathan M Davis via Digitalmars-d
2014-07-31 00:06:49 UTC
On Wednesday, 30 July 2014 at 22:58:01 UTC, Andrei Alexandrescu
On 7/30/14, 3:39 PM, Joseph Rushton Wakeling via Digitalmars-d
Post by Joseph Rushton Wakeling via Digitalmars-d
Post by Walter Bright via Digitalmars-d
7. using enforce() to check for program bugs is utterly wrong. enforce() is a
library creation, the core language does not recognize it.
A question on that.
There are various places in Phobos where enforce() statements
are used
to validate function input or class constructor parameters.
Yah, Phobos is a bit inconsistent about that. TDPL discusses
the matter: if a library is deployed in separation from the
program(s) it serves, it may as well handle arguments as
"input". That's what e.g. the Windows API is doing - it
consistently considers all function arguments "inputs", scrubs
them, and returns error codes for all invalid inputs it
detects. In contrast, the traditional libc/Unix interface does
little checking, even a strlen(NULL) will segfault.
Phobos is somewhere in the middle - sometimes it verifies
arguments with enforce(), some other times it just uses
assert().
Yeah, we're not entirely consistent with it. However, if it would
definitely be a program bug for an argument to not pass a
particular condition, then it should be an assertion, and if it's
definitely likely to be program input (e.g. this is frequently
the case with strings), then exceptions are the appropriate
approach. It's the cases where it's not obviously program input
that's more debatable. Forcing checks and throwing exceptions
incurs overhead, but it can significantly reduce programming
bugs, because it doesn't put the onus on the programmer to verify
the arguments. Using assertions is more performant but can
significantly increase the risk of bugs - particularly when the
assertions will all be compiled out when Phobos is compiled into
a library unless the function is templated.

I know that Walter favors using assertions everywhere and then
providing functions which do the checks so that the programmer
can check and then throw if appropriate, but the check isn't
forced. Personally, I much prefer being defensive and to default
to checking the input and throwing on bad input but to provide a
way to avoid the check if you've already validated the input and
don't want the cost of the check. For instance, many of the
functions in std.datetime throw (particularly constructors),
because it's being defensive, but it's on my todo list to add
functions to bypass some of the checks (e.g. a function which
constructs the type without doing any checks in addition to
having the normal constructors). Regardless, I think that using
assertions as the go-to solution for validating function
arguments is generally going to result in a lot more programming
bugs. I'd much prefer to default to being safe but provide
backdoors for speed when you need it (which is generally how D
does things).

But regardless of which approach you prefer, there are some cases
where it's pretty clear whether an assertion or exception should
be used, and there are other cases where it's highly debatable -
primarily depending on whether you want to treat a function's
arguments as user input or rely on the programmer to do all of
the necessary validations first.

- Jonathan M Davis
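
The "safe by default, with a backdoor for speed" approach described above could look roughly like this. The Month type and its unchecked() factory are hypothetical names chosen for illustration; they are not taken from std.datetime.

```d
import std.exception : enforce;

// Hypothetical type, not part of std.datetime.
struct Month
{
    private ubyte _value;

    // Default path: treat the argument as input, validate it, throw on failure.
    this(int value) pure @safe
    {
        enforce(value >= 1 && value <= 12, "month must be in 1 .. 12");
        _value = cast(ubyte) value;
    }

    // Backdoor: the caller promises the value has already been validated,
    // so only an assert remains and the function can be nothrow.
    static Month unchecked(int value) pure nothrow @safe @nogc
    {
        assert(value >= 1 && value <= 12);
        Month m;
        m._value = cast(ubyte) value;
        return m;
    }

    int value() const pure nothrow @safe @nogc
    {
        return _value;
    }
}
```

Code that has already validated its data can call Month.unchecked(n) and skip the cost of the exception path, while everything else gets the checked constructor.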
Ary Borenszweig via Digitalmars-d
2014-07-30 23:05:35 UTC
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly wrong. enforce()
is a library creation, the core language does not recognize it.
What do you suggest to use to check program bugs?
H. S. Teoh via Digitalmars-d
2014-07-30 23:49:04 UTC
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly wrong. enforce()
is a library creation, the core language does not recognize it.
What do you suggest to use to check program bugs?
This is what assert is designed for.

But if you don't want the check ever to be removed, currently you have to
write:

if (!requiredCondition)
    assert(0); // compiler never removes this

which IMO is relatively clear, and the use of assert(0) for forceful
termination is consistent with existing practice in D code.


T
--
Старый друг лучше новых двух. (An old friend is better than two new ones.)
Dicebot via Digitalmars-d
2014-07-31 01:16:41 UTC
On Wednesday, 30 July 2014 at 23:50:51 UTC, H. S. Teoh via
Post by H. S. Teoh via Digitalmars-d
But if you don't want the check ever to be removed, currently
you have to
if (!requiredCondition)
assert(0); // compiler never removes this
which IMO is relatively clear, and the use of assert(0) for
forceful
termination is consistent with existing practice in D code.
Not helping.

```
import std.stdio;

void foo()
in { writeln("in contract"); }
body { }

void main() { foo(); }
```

Compile with -release and check the output.
Walter Bright via Digitalmars-d
2014-07-31 07:42:20 UTC
Post by Dicebot via Digitalmars-d
Post by H. S. Teoh via Digitalmars-d
But if you don't want the check ever to be removed, currently you have to
if (!requiredCondition)
assert(0); // compiler never removes this
which IMO is relatively clear, and the use of assert(0) for forceful
termination is consistent with existing practice in D code.
Not helping.
```
import std.stdio;
void foo()
in { writeln("in contract"); }
body { }
void main() { foo(); }
```
Compile with -release and check the output.
What do you expect to happen?
Dicebot via Digitalmars-d
2014-07-31 15:48:41 UTC
Post by Walter Bright via Digitalmars-d
Post by Dicebot via Digitalmars-d
On Wednesday, 30 July 2014 at 23:50:51 UTC, H. S. Teoh via
Post by H. S. Teoh via Digitalmars-d
But if you don't want the check ever to be removed, currently
you have to
if (!requiredCondition)
assert(0); // compiler never removes this
which IMO is relatively clear, and the use of assert(0) for
forceful
termination is consistent with existing practice in D code.
Not helping.
```
import std.stdio;
void foo()
in { writeln("in contract"); }
body { }
void main() { foo(); }
```
Compile with -release and check the output.
What do you expect to happen?
It acts as defined in spec, nothing unexpected here. I was
referring to H. S. Teoh proposed workaround to keep assertions in
release mode - it does not work with contracts because those are
eliminated completely, not just assertions inside.
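
To make the point concrete: since -release drops the whole `in` block, a check that must survive has to sit in the function body. A small sketch, assuming a hypothetical firstElement() function invented for this example:

```d
// Hypothetical function showing where a must-survive check has to live.
int firstElement(const(int)[] values) pure nothrow @safe @nogc
{
    // Putting this test in an `in` contract would not help: the contract
    // is removed entirely by -release. A plain `if` in the body stays,
    // and assert(0) is kept as a halt.
    if (values.length == 0)
        assert(0, "values must not be empty");
    return values[0];
}
```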
Walter Bright via Digitalmars-d
2014-07-31 07:37:18 UTC
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly wrong. enforce()
is a library creation, the core language does not recognize it.
What do you suggest to use to check program bugs?
assert()
Ary Borenszweig via Digitalmars-d
2014-07-31 18:43:48 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly wrong. enforce()
is a library creation, the core language does not recognize it.
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are of
undefined behavior, instead of halting the program immediately.
H. S. Teoh via Digitalmars-d
2014-07-31 19:34:56 UTC
Post by Ary Borenszweig via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly wrong.
enforce() is a library creation, the core language does not
recognize it.
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are of
undefined behavior, instead of halting the program immediately.
Isn't that already what you're doing with the current behaviour of
assert? Not only in D, but also in C/C++.


T
--
When solving a problem, take care that you do not become part of the problem.
Jonathan M Davis via Digitalmars-d
2014-07-31 19:37:28 UTC
Post by Ary Borenszweig via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly
wrong. enforce()
is a library creation, the core language does not recognize
it.
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are
of undefined behavior, instead of halting the program
immediately.
Then don't build with -release. You can even build with
-boundscheck=safe if you want to turn off bounds checking in
@system code like -release does. IIRC, the only things that
-release does are disable assertions, disable contracts, turn
assert(0) into a halt instruction, and disable bounds checking in
@system and @trusted code. So, if you want to keep the assertions
and contracts and whatnot in, just don't use -release and use
-boundscheck=safe to get the bounds checking changes that
-release does.

- Jonathan M Davis
Timon Gehr via Digitalmars-d
2014-07-31 20:49:17 UTC
Post by Jonathan M Davis via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly wrong. enforce()
is a library creation, the core language does not recognize it.
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are of
undefined behavior, instead of halting the program immediately.
Then don't build with -release. You can even build with
-boundscheck=safe if you want to turn off bounds checking in @system
code like -release does. IIRC, the only things that -release does are
disable assertions,
No, according to the OP -release assumes assertions to be true.
Post by Jonathan M Davis via Digitalmars-d
disable contracts, turn assert(0) into a halt instruction, and disable
bounds checking in @system and @trusted code.
So, if you want to keep the assertions and contracts and whatnot in,
Unfortunately, if used pervasively, assertions and contracts and whatnot
may actually hog the speed of a program in a way that breaks the deal.

Disabling assertions (and whatnot), assuming assertions to be true (and
disabling whatnot) and leaving all assertions and whatnot in are
different trade-offs, of which assuming all assertions to be true is the
most dangerous one. Why hide this behaviour in '-release'?
Post by Jonathan M Davis via Digitalmars-d
just don't use -release and use -boundscheck=safe to get the bounds
checking changes that -release does.
- Jonathan M Davis
This leaves assertions and contracts in though.
Ary Borenszweig via Digitalmars-d
2014-07-31 19:38:39 UTC
Post by H. S. Teoh via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly wrong.
enforce() is a library creation, the core language does not
recognize it.
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are of
undefined behavior, instead of halting the program immediately.
Isn't that already what you're doing with the current behaviour of
assert? Not only in D, but also in C/C++.
T
I don't program in those languages, and if I did I would always use
exceptions (at least in C++). I don't want to compromise the safety of
my programs and if they fail I want to get a clean backtrace, not some
random undefined behaviour resulting in a segfault.
Jonathan M Davis via Digitalmars-d
2014-07-31 19:39:54 UTC
On Thursday, 31 July 2014 at 19:36:34 UTC, H. S. Teoh via
On Thu, Jul 31, 2014 at 03:43:48PM -0300, Ary Borenszweig via
Post by Ary Borenszweig via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
7. using enforce() to check for program bugs is utterly
wrong.
enforce() is a library creation, the core language does not
recognize it.
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are of
undefined behavior, instead of halting the program immediately.
Isn't that already what you're doing with the current behaviour
of
assert? Not only in D, but also in C/C++.
Yes, but I think that his point was that he wants a way to check
programming bugs in release mode, and Walter was saying not to
use enforce for checking programming bugs. So, that leaves the
question of how to check them in release mode, since assertions
won't work in release mode. But the answer to that is normally to
not compile in release mode. And I believe that dmd gives enough
control over that that you can get everything that -release does
without disabling assertions using flags other than -release.

- Jonathan M Davis
H. S. Teoh via Digitalmars-d
2014-07-31 19:54:10 UTC
On Thursday, 31 July 2014 at 19:36:34 UTC, H. S. Teoh via Digitalmars-d
On Thu, Jul 31, 2014 at 03:43:48PM -0300, Ary Borenszweig via
[...]
Post by Ary Borenszweig via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are of
undefined behavior, instead of halting the program immediately.
Isn't that already what you're doing with the current behaviour of
assert? Not only in D, but also in C/C++.
Yes, but I think that his point was that he wants a way to check
programming bugs in release mode, and Walter was saying not to use
enforce for checking programming bugs. So, that leaves the question of
how to check them in release mode, since assertions won't work in
release mode. But the answer to that is normally to not compile in
release mode. And I believe that dmd gives enough control over that
that you can get everything that -release does without disabling
assertions using flags other than -release.
[...]

Ah, I see.

But doesn't that just mean that you shouldn't use -release, period?
AFAIK, the only thing -release does is to remove various safety checks,
like array bounds checks, asserts, contracts (which are generally
written using asserts), etc.. I'd think that Ary wouldn't want any of
these disabled, so he shouldn't use -release at all. There's already -O
and -inline to enable what people generally expect from a release build,
so -release wouldn't really be needed at all.

Right?


T
--
Some ideas are so stupid that only intellectuals could believe them. -- George Orwell
Ary Borenszweig via Digitalmars-d
2014-07-31 20:17:13 UTC
Post by H. S. Teoh via Digitalmars-d
On Thursday, 31 July 2014 at 19:36:34 UTC, H. S. Teoh via Digitalmars-d
On Thu, Jul 31, 2014 at 03:43:48PM -0300, Ary Borenszweig via
[...]
Post by Ary Borenszweig via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Ary Borenszweig via Digitalmars-d
What do you suggest to use to check program bugs?
assert()
Then you are potentially releasing programs with bugs that are of
undefined behavior, instead of halting the program immediately.
Isn't that already what you're doing with the current behaviour of
assert? Not only in D, but also in C/C++.
Yes, but I think that his point was that he wants a way to check
programming bugs in release mode, and Walter was saying not to use
enforce for checking programming bugs. So, that leaves the question of
how to check them in release mode, since assertions won't work in
release mode. But the answer to that is normally to not compile in
release mode. And I believe that dmd gives enough control over that
that you can get everything that -release does without disabling
assertions using flags other than -release.
[...]
Ah, I see.
But doesn't that just mean that you shouldn't use -release, period?
AFAIK, the only thing -release does is to remove various safety checks,
like array bounds checks, asserts, contracts (which are generally
written using asserts), etc.. I'd think that Ary wouldn't want any of
these disabled, so he shouldn't use -release at all. There's already -O
and -inline to enable what people generally expect from a release build,
so -release wouldn't really be needed at all.
Right?
T
That's exactly my point, thank you for summing that up :-)

I don't see the point of having a "-release" flag. It should be renamed
to "-a-bit-faster-but-unsafe".

I think there are other languages that do quite well in terms of
performance without disabling bounds checks and other stuff.
H. S. Teoh via Digitalmars-d
2014-07-31 20:33:34 UTC
[...]
Post by Ary Borenszweig via Digitalmars-d
Post by H. S. Teoh via Digitalmars-d
But doesn't that just mean that you shouldn't use -release, period?
AFAIK, the only thing -release does is to remove various safety
checks, like array bounds checks, asserts, contracts (which are
generally written using asserts), etc.. I'd think that Ary wouldn't
want any of these disabled, so he shouldn't use -release at all.
There's already -O and -inline to enable what people generally expect
from a release build, so -release wouldn't really be needed at all.
Right?
T
That's exactly my point, thank you for summing that up :-)
I don't see the point of having a "-release" flag. It should be
renamed to "-a-bit-faster-but-unsafe".
It's probably named -release because traditionally, turning off asserts
and bounds checks is what is normally done when building a C/C++ project
for release. (C/C++ don't have built-in bounds checks, but I've seen
projects that use macros for manually implementing this, which are
#define'd away when building for release.)
Post by Ary Borenszweig via Digitalmars-d
I think there are other languages that do quite well in terms of
performance without disabling bounds checks and other stuff.
It depends on what you're doing. If you have assert's in CPU intensive
inner loops, turning them off can make a big difference in performance.


T
--
Almost all proofs have bugs, but almost all theorems are true. -- Paul Pedersen
Timon Gehr via Digitalmars-d
2014-07-31 20:53:20 UTC
Post by H. S. Teoh via Digitalmars-d
[...]
Post by H. S. Teoh via Digitalmars-d
Post by H. S. Teoh via Digitalmars-d
But doesn't that just mean that you shouldn't use -release, period?
AFAIK, the only thing -release does is to remove various safety
checks, like array bounds checks, asserts, contracts (which are
generally written using asserts), etc.. I'd think that Ary wouldn't
want any of these disabled, so he shouldn't use -release at all.
There's already -O and -inline to enable what people generally expect
from a release build, so -release wouldn't really be needed at all.
Post by H. S. Teoh via Digitalmars-d
Right?
T
That's exactly my point, thank you for summing that up:-)
I don't see the point of having a "-release" flag. It should be
renamed to "-a-bit-faster-but-unsafe".
It's probably named -release because traditionally, turning off asserts
and bounds checks is what is normally done when building a C/C++ project
for release. (C/C++ don't have built-in bounds checks, but I've seen
projects that use macros for manually implementing this, which are
#define'd away when building for release.)
The suggestion is not: 'make -release disable assertions.' it is
'failing assertions are undefined behaviour in -release'.
Tobias Müller via Digitalmars-d
2014-07-30 23:51:30 UTC
Post by Walter Bright via Digitalmars-d
2. The compiler can make use of assert expressions to improve
optimization, even in -release mode.
I can see the benefits of that, but I consider it very dangerous.

It is similar to undefined behavior in C/C++. There the 'assume/assert' is
implicit, not explicit, but it's still the same effect.

If the assume/assert is hidden somewhere in a function you basically
introduce new traps for UB.

Initially I was a strong proponent of such optimizations:
(a + a)/2 can be optimized to just a for signed integers, that's nice, the
classic example. This inserts an implicit assume(a < INT_MAX/2).

My opinion suddenly changed when I realized that such assumptions (explicit
or implicit) can also propagate up/backwards and leak into a bigger
context.
A wrong assumption can introduce bugs in seemingly unrelated parts of the
program that would actually be correct on their own.

With relatively 'dumb' compilers, this is not a big problem, but optimizers
are more and more clever and will take profit of such assumptions if they
can.

Tobi
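
The (a + a)/2 example can be checked directly in D, where signed overflow is defined as two's-complement wraparound rather than undefined behaviour. The sketch below (foldIsValid is a hypothetical name) shows that folding the expression to `a` is only correct when a + a does not overflow, which is exactly the implicit assumption described above.

```d
// Returns true when replacing (a + a) / 2 with `a` would give the same result.
bool foldIsValid(int a) pure nothrow @safe @nogc
{
    // D defines signed integer overflow as wraparound, so this
    // expression is well defined for every int value.
    return (a + a) / 2 == a;
}
```

For a = int.max the sum wraps to -2 and the quotient is -1, so the fold would change the result. A C or C++ compiler may perform it anyway because overflow there is undefined.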
Andrei Alexandrescu via Digitalmars-d
2014-07-31 00:29:47 UTC
Post by Tobias Müller via Digitalmars-d
With relatively 'dumb' compilers, this is not a big problem, but optimizers
are more and more clever and will take profit of such assumptions if they
can.
That's true, and it seems like a growing trend. Relevant threads:

https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/9S5jNRW-5wY

http://www.spinics.net/lists/gcchelp/msg41714.html

Recent versions of gcc and clang have become increasingly aggressive
about optimizing code by taking advantage of making undefined behavior
_really_ undefined. There's been a couple of posts in the news recently
that I can't find at the moment.


Andrei
Andrei Alexandrescu via Digitalmars-d
2014-07-31 00:32:10 UTC
Post by Andrei Alexandrescu via Digitalmars-d
Post by Tobias Müller via Digitalmars-d
With relatively 'dumb' compilers, this is not a big problem, but optimizers
are more and more clever and will take profit of such assumptions if they
can.
https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/9S5jNRW-5wY
http://www.spinics.net/lists/gcchelp/msg41714.html
Recent versions of gcc and clang have become increasingly aggressive
about optimizing code by taking advantage of making undefined behavior
_really_ undefined. There's been a couple of posts in the news recently
that I can't find at the moment.
I think I found it: http://www.redfelineninja.org.uk/daniel/?p=307

Andrei
Johannes Pfau via Digitalmars-d
2014-07-31 13:32:22 UTC
Am Wed, 30 Jul 2014 17:32:10 -0700
Post by Andrei Alexandrescu via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Tobias Müller via Digitalmars-d
With relatively 'dumb' compilers, this is not a big problem, but optimizers
are more and more clever and will take profit of such assumptions
if they can.
https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/9S5jNRW-5wY
http://www.spinics.net/lists/gcchelp/msg41714.html
Recent versions of gcc and clang have become increasingly aggressive
about optimizing code by taking advantage of making undefined
behavior _really_ undefined. There's been a couple of posts in the
news recently that I can't find at the moment.
I think I found it: http://www.redfelineninja.org.uk/daniel/?p=307
Andrei
Also this:

Linus Torvalds On GCC 4.9: Pure & Utter Crap
http://www.phoronix.com/scan.php?page=news_item&px=MTc1MDQ

(This actually is a GCC bug, but valid behaviour for normal C++ code.
GCC only broke the compiler switch to explicitly force non-standard
behaviour)
Daniel Gibson via Digitalmars-d
2014-07-31 13:44:35 UTC
Post by Johannes Pfau via Digitalmars-d
Am Wed, 30 Jul 2014 17:32:10 -0700
Post by Andrei Alexandrescu via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
Post by Tobias Müller via Digitalmars-d
With relatively 'dumb' compilers, this is not a big problem, but optimizers
are more and more clever and will take profit of such assumptions
if they can.
https://groups.google.com/a/isocpp.org/forum/#!topic/std-proposals/9S5jNRW-5wY
http://www.spinics.net/lists/gcchelp/msg41714.html
Recent versions of gcc and clang have become increasingly aggressive
about optimizing code by taking advantage of making undefined
behavior _really_ undefined. There's been a couple of posts in the
news recently that I can't find at the moment.
I think I found it: http://www.redfelineninja.org.uk/daniel/?p=307
Andrei
Linus Torvalds On GCC 4.9: Pure & Utter Crap
http://www.phoronix.com/scan.php?page=news_item&px=MTc1MDQ
(This actually is a GCC bug, but valid behaviour for normal C++ code.
GCC only broke the compiler switch to explicitly force non-standard
behaviour)
And don't forget this (rather old) case:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want
an optimizer to use clever inlining, use SSE etc where it makes sense
and stuff like that - but not to remove code I wrote.)

Cheers,
Daniel
Artur Skawina via Digitalmars-d
2014-07-31 15:26:11 UTC
And don't forget this (rather old) case: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want an optimizer to use clever inlining, use SSE etc where it makes sense and stuff like that - but not to remove code I wrote.)
That is actually not a bug, but a perfectly valid optimization. The
compiler isn't clairvoyant and can not know that some data that you
wrote, but never read back, matters.

The solution is to tell the compiler that you really need that newly
(over-)written data. Eg

asm {"" : : "m" (*cast(typeof(password[0])[9999999]*)password.ptr); }

(yes, stdizing compiler barriers would be a good idea)

artur
Daniel Gibson via Digitalmars-d
2014-07-31 15:37:22 UTC
Post by Artur Skawina via Digitalmars-d
And don't forget this (rather old) case: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want an optimizer to use clever inlining, use SSE etc where it makes sense and stuff like that - but not to remove code I wrote.)
That is actually not a bug, but a perfectly valid optimization. The
compiler isn't clairvoyant and can not know that some data that you
wrote, but never read back, matters.
I don't want the compiler to care about that. When I tell it to write
something, I want it to do that, even if it might look like nonsense (if
anything, it could create a warning).
Post by Artur Skawina via Digitalmars-d
The solution is to tell the compiler that you really need that newly
(over-)written data. Eg
asm {"" : : "m" (*cast(typeof(password[0])[9999999]*)password.ptr); }
inline asm is not portable
Post by Artur Skawina via Digitalmars-d
(yes, stdizing compiler barriers would be a good idea)
C11 defines a memset_s which is guaranteed not to be optimized away..
that's 9 years after that bugreport and will probably never be supported
by MSVC (they don't even support C99).
One could write a memset_s oneself.. that does a memset, reads the data
and writes a char of it or something to a global variable (hoping that
the compiler won't optimize that to "just set that variable to 0").

The thing is: I don't want a compiler to remove code I wrote just
because it "thinks" it's superfluous.
It could tell me about it as a warning, but it shouldn't just silently
do it. If removing code makes my code faster, I can do it myself.

Cheers,
Daniel
David Nadlinger via Digitalmars-d
2014-07-31 15:54:13 UTC
Post by Daniel Gibson via Digitalmars-d
If removing code makes my code faster, I can do it myself.
No, in general you can't. Opportunities for dead code elimination
often only present themselves after inlining and/or in generic
(as in templates) code.
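
A minimal D sketch of that point, using hypothetical names (`sumWithOffset`, `plainSum`) that do not come from the thread: the routine is written for the general case, and the dead branch only shows up once a caller fixes an argument and the call is inlined.

```d
// Hypothetical illustration; the names are not from the thread.

// Written for the general case: every element may get an offset added.
int sumWithOffset(const(int)[] data, int offset)
{
    int total = 0;
    foreach (x; data)
    {
        if (offset != 0)   // looks necessary when this function is read alone
            total += offset;
        total += x;
    }
    return total;
}

// Once the call below is inlined, `offset` is the constant 0, so the branch
// above is dead in this one instance. No single place in the source shows
// that dead code, so it cannot simply be deleted by hand.
int plainSum(const(int)[] data)
{
    return sumWithOffset(data, 0);
}
```

Deleting the branch by hand would break every caller that passes a non-zero offset; only the optimizer sees the specialized copy.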

David
bearophile via Digitalmars-d
2014-07-31 15:55:04 UTC
Post by Daniel Gibson via Digitalmars-d
Post by Artur Skawina via Digitalmars-d
asm {"" : : "m" (*cast(typeof(password[0])[9999999]*)password.ptr); }
inline asm is not portable
See:
https://d.puremagic.com/issues/show_bug.cgi?id=10661
Post by Daniel Gibson via Digitalmars-d
C11 defines a memset_s which is guaranteed not to be optimized
away..
that's 9 years after that bugreport and will probably never be
supported by MSVC (they don't even support C99).
See:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366877%28v=vs.85%29.aspx

Bye,
bearophile
Artur Skawina via Digitalmars-d
2014-07-31 16:02:32 UTC
Post by Artur Skawina via Digitalmars-d
And don't forget this (rather old) case: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want an optimizer to use clever inlining, use SSE etc where it makes sense and stuff like that - but not to remove code I wrote.)
That is actually not a bug, but a perfectly valid optimization. The
compiler isn't clairvoyant and can not know that some data that you
wrote, but never read back, matters.
I don't want the compiler to care about that. When I tell it to write something, I want it to do that, even if it might look like nonsense (if anything, it could create a warning).
This approach becomes completely impractical once generic code (templates)
enters the picture. But even for simple cases, it does not work. Keep in
mind that D initializes every object by default...
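
A small illustration of the default-initialization point (the function names are made up for this sketch): the implicit store of `int.init` in the first function is never read, so a "never remove a write" rule would forbid eliminating it. The `= void` initializer is D's explicit opt-out.

```d
// Hypothetical illustration; the names are not from the thread.

int addDefaultInit(int a, int b)
{
    int result;        // implicitly `int result = 0;`, a store nobody reads
    result = a + b;    // overwrites the default before any read
    return result;
}

int addVoidInit(int a, int b)
{
    int result = void; // programmer explicitly skips default initialization
    result = a + b;    // must be assigned before it is read
    return result;
}
```

Both functions return the same value; they differ only in whether the source asks for the initial store.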
Post by Artur Skawina via Digitalmars-d
The solution is to tell the compiler that you really need that newly
(over-)written data. Eg
asm {"" : : "m" (*cast(typeof(password[0])[9999999]*)password.ptr); }
inline asm is not portable
That's why a portable compiler barrier interface is needed.
But this was just an example showing a zero-cost solution. A portable
fallback is always possible (the bug report was about C code -- there,
a loop that reads the data and stores a copy into a volatile location
would work).

artur
Andrei Alexandrescu via Digitalmars-d
2014-07-31 16:29:16 UTC
When I tell it to write something, I want it to do that, even if it
might look like nonsense (if anything, it could create a warning).
I'm afraid those days are long gone by now. -- Andrei
Daniel Gibson via Digitalmars-d
2014-07-31 17:40:45 UTC
Post by Andrei Alexandrescu via Digitalmars-d
When I tell it to write something, I want it to do that, even if it
might look like nonsense (if anything, it could create a warning).
I'm afraid those days are long gone by now. -- Andrei
At least for C..
It sucks not to be able to predict the behavior of sane-looking code
without knowing the language standard in all details by heart.

I'd prefer if D only did optimizations that are safe and don't change
the behavior of the code and thus was less painful to use than C and C++
that need deep understanding of complicated standards to get right ("the
last thing D needs is someone like Scott Meyers"?)

Examples for eliminations that I'd consider safe:
* if something is checked twice (e.g. due to inlining) it's okay to only
keep the first check
* if a variable is certainly not read before the first assignment, the
standard initialization could be optimized away.. same for multiple
assignments without reads in between, *but*
- that would still look strange in a debugger when you'd expect
another value
- don't we have int x = void; to prevent default initialization?
* turning multiplies into shifts and additions is totally fine if it
does not change the behavior, even in corner cases.
* optimizing out variables
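
A hedged sketch of why the "even in corner cases" qualifier matters for the shift rewrite (helper names are invented for this example). Multiplying by a power of two matches a left shift for every `int`, but dividing a negative number by two does not match an arithmetic right shift, so an optimizer has to emit a fix-up there.

```d
// Hypothetical helpers; the names are not from the thread.

int mulBy8(int x) { return x * 8; }   // what the programmer wrote
int shlBy3(int x) { return x << 3; }  // what the optimizer may emit instead

int divBy2(int x) { return x / 2; }   // integer division truncates toward zero
int sarBy1(int x) { return x >> 1; }  // arithmetic shift rounds toward -infinity
```

For `x = -7`, `mulBy8` and `shlBy3` agree, while `divBy2` and `sarBy1` differ by one.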

It's a major PITA to debug problems that only happen in release builds.

Cheers,
Daniel
John Colvin via Digitalmars-d
2014-07-31 17:57:31 UTC
Post by Daniel Gibson via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
When I tell it to write something, I want it to do that, even if it
might look like nonsense (if anything, it could create a
warning).
I'm afraid those days are long gone by now. -- Andrei
At least for C..
It sucks not to be able to predict the behavior of sane-looking
code without knowing the language standard in all details by
heart.
I'd prefer if D only did optimizations that are safe and don't
change the behavior of the code and thus was less painful to
use than C and C++ that need deep understanding of complicated
standards to get right ("the last thing D needs is someone like
Scott Meyers"?)
* if something is checked twice (e.g. due to inlining) it's
okay to only keep the first check
* if a variable is certainly not read before the first
assignment, the standard initialization could be optimized
away.. same for multiple assignments without reads in between,
*but*
- that would still look strange in a debugger when you'd
expect another value
- don't we have int x = void; to prevent default
initialization?
* turning multiplies into shifts and additions is totally fine
if it does not change the behavior, even in corner cases.
* optimizing out variables
It's a major PITA to debug problems that only happen in release builds.
Cheers,
Daniel
I believe gdc has the full suite of gcc optimiser flags
available. You don't need to just slap -O3 on and then complain
about the changes it makes, you can choose with a reasonable
amount of precision which optimisations you do/don't want done.
Walter Bright via Digitalmars-d
2014-07-31 22:02:13 UTC
I believe gdc has the full suite of gcc optimiser flags available. You don't
need to just slap -O3 on and then complain about the changes it makes, you can
choose with a reasonable amount of precision which optimisations you do/don't
want done.
DMC++ has similar switches, but nobody uses them, and they are pretty much
useless anyway, because:

1. you need to be a compiler guy to know what they mean - very few people know
what "code hoisting" is

2. it's the combination of optimizations that produces results - they are not
independent of each other

This is why I didn't include such switches in DMD.
Walter Bright via Digitalmars-d
2014-07-31 21:59:28 UTC
Post by Daniel Gibson via Digitalmars-d
It's a major PITA to debug problems that only happen in release builds.
Debugging optimized code was a well known problem even back in the 70's. Nobody
has solved it, and nobody wants unoptimized code.
Daniel Gibson via Digitalmars-d
2014-07-31 22:21:46 UTC
Post by Walter Bright via Digitalmars-d
Post by Daniel Gibson via Digitalmars-d
It's a major PITA to debug problems that only happen in release builds.
Debugging optimized code was a well known problem even back in the 70's.
Nobody has solved it, and nobody wants unoptimized code.
Yeah, and because of this I'd like optimizations not to cause different
behavior if at all possible, to keep these kinds of bugs as rare as possible.

And I agree with your stance on those fine-grained optimization switches
from your other post. GCC currently has 191 flags that influence
optimization[1] (+ a version that negates them for most), and I don't
understand what most of them do, so it would be hard for me to decide
which optimizations I want and which I don't want.

However, what about an extra flag for "unsafe" optimizations?
I'd like the compiler to do inlining, replacing int multiplications by
powers of two with shifts and other "safe" optimizations that don't
change the semantics of my program (see the examples in the post you
quoted), but I *don't* want it to e.g. remove writes to memory that
isn't read afterwards or make assumptions based on assertions (that are
disabled in the current compile mode).

And maybe a warning mode that tells me about "dead"/"superfluous" code
that would be eliminated in an optimized build so I can check if that
would break anything for me in that respect without trying to understand
the asm output would be helpful.

Cheers,
Daniel


[1] according to $ gcc-4.8 --help=optimizer | grep "^ -" | wc -l
Walter Bright via Digitalmars-d
2014-08-01 01:12:11 UTC
And I agree with your stance on those fine-grained optimization switches from
your other post. GCC currently has 191 flags that influence optimization[1]
Just consider this from a testing standpoint. As I mentioned previously,
optimizations interact with each other to produce emergent behavior. GCC has 191
factorial different optimizers. Google's calculator puts 191! at infinity, which
it might as well be.
However, what about an extra flag for "unsafe" optimizations?
There's been quite a creep of adding more and more flags. Each one of them is,
in a way, a failure of design, and we are all too quick to reach for that.
I *don't*
want it to e.g. remove writes to memory that isn't read afterwards
That's what volatileStore() is for.
or make assumptions based on assertions (that are disabled in the current compile mode).
This is inexorably coming. If you cannot live with it, I suggest writing your
own version of assert, using the Phobos 'enforce' implementation as a model.
It'll do what you want.
And maybe a warning mode that tells me about "dead"/"superfluous" code that
would be eliminated in an optimized build so I can check if that would break
anything for me in that respect without trying to understand the asm output
would be helpful.
If you compile DMD with -D, and then run it with -O --c, it will present you
with a list of all the data flow optimizations performed on the code. It's very
useful for debugging the optimizer. Although I think you'll find it
illuminating, you won't find it very useful - for one thing, a blizzard of info
is generated.
John Colvin via Digitalmars-d
2014-07-31 16:36:26 UTC
Post by Daniel Gibson via Digitalmars-d
Post by Artur Skawina via Digitalmars-d
Post by Daniel Gibson via Digitalmars-d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want
an optimizer to use clever inlining, use SSE etc where it makes sense
and stuff like that - but not to remove code I wrote.)
That is actually not a bug, but a perfectly valid optimization. The
compiler isn't clairvoyant and can not know that some data that you
wrote, but never read back, matters.
I don't want the compiler to care about that. When I tell it to write
something, I want it to do that, even if it might look like nonsense (if
anything, it could create a warning).
The thing is: I don't want a compiler to remove code I wrote just
because it "thinks" it's superfluous.
The idea that the compiler simply lowers your code to
well-written assembly died decades ago. The job of an optimiser
is to *not* use your code but instead to compile a program that
is equivalent to yours but faster/smaller. What is equivalent is
defined by the language spec.
Daniel Murphy via Digitalmars-d
2014-07-31 17:26:35 UTC
Post by Daniel Gibson via Digitalmars-d
One could write a memset_s oneself.. that does a memset, reads the data
and writes a char of it or something to a global variable (hoping that the
compiler won't optimize that to "just set that variable to 0").
Some compilers will do exactly that optimization.
Post by Daniel Gibson via Digitalmars-d
The thing is: I don't want a compiler to remove code I wrote just because
it "thinks" it's superfluous.
It could tell me about it as a warning, but it shouldn't just silently do
it. If removing code makes my code faster, I can do it myself.
No you don't, no you can't. You are using an optimizer because writing it
in the perfectly precise, perfectly fast way makes your code unmaintainable.
You want the optimizer to delete all those never-read initializations, drop
all those temporary variables, and turn your multiplies into shifts and
additions.

If you didn't, you wouldn't be using an optimizer.
Walter Bright via Digitalmars-d
2014-07-31 22:04:30 UTC
No you don't, no you can't. You are using an optimizer because writing it in
the perfectly precise, perfectly fast way makes your code unmaintainable. You
want the optimizer to delete all those never-read initializations, drop all
those temporary variables, and turn your multiplies into shifts and additions.
If you didn't, you wouldn't be using an optimizer.
Even "unoptimized" code does optimizations.

If you really want unoptimized code, what you write is what you get, the inline
assembler beckons!
Walter Bright via Digitalmars-d
2014-07-31 22:08:01 UTC
Post by Artur Skawina via Digitalmars-d
Post by Daniel Gibson via Digitalmars-d
Post by Artur Skawina via Digitalmars-d
The solution is to tell the compiler that you really need that newly
(over-)written data. Eg
asm {"" : : "m" (*cast(typeof(password[0])[9999999]*)password.ptr); }
inline asm is not portable
That's why a portable compiler barrier interface is needed.
But this was just an example showing a zero-cost solution. A portable
fallback is always possible (the bug report was about C code -- there,
a loop that reads the data and stores a copy into a volatile location
would work).
This is not a "barrier" operation. There you are thinking of atomic operations.
This is a case of a "volatile" operation, and this supports it for D:

https://github.com/D-Programming-Language/druntime/pull/892

Of course, someone has to actually pull it!
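
A rough sketch of the password-wipe case using volatile stores. It assumes a druntime that ships `core.volatile` with `volatileStore`; the pull request linked above is where these intrinsics were proposed, and the module path on a given compiler version is an assumption here. The function names are invented for illustration.

```d
import core.stdc.string : memset;
import core.volatile : volatileStore;   // assumed module path, see note above

// Plain wipe: if the buffer is never read again, an optimizer is allowed to
// treat these stores as dead and drop the whole call.
void wipePlain(ubyte[] buf)
{
    memset(buf.ptr, 0, buf.length);
}

// Volatile wipe: every byte is written through volatileStore, which tells
// the compiler that the store itself is required even if nothing reads it.
void wipeVolatile(ubyte[] buf)
{
    foreach (i; 0 .. buf.length)
        volatileStore(&buf[i], cast(ubyte) 0);
}
```

Both functions leave the buffer zeroed when the stores are actually performed; the difference is only in what the optimizer is permitted to remove.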
Artur Skawina via Digitalmars-d
2014-07-31 22:57:57 UTC
Post by Walter Bright via Digitalmars-d
Post by Artur Skawina via Digitalmars-d
Post by Daniel Gibson via Digitalmars-d
Post by Artur Skawina via Digitalmars-d
The solution is to tell the compiler that you really need that newly
(over-)written data. Eg
asm {"" : : "m" (*cast(typeof(password[0])[9999999]*)password.ptr); }
inline asm is not portable
That's why a portable compiler barrier interface is needed.
But this was just an example showing a zero-cost solution. A portable
fallback is always possible (the bug report was about C code -- there,
a loop that reads the data and stores a copy into a volatile location
would work).
https://github.com/D-Programming-Language/druntime/pull/892
It's a _compiler_ barrier and has nothing to do with atomic ops or
volatile. It simply tells the compiler (in this case) 'I'm going
to read the data in the memory locations pointed to by password.ptr'.
That means that the compiler has to make sure that the data is there,
before the `asm` executes; it can not assume that the stores are dead
and can not optimize them away. The actual (emitted) asm does nothing.
It's just a way to communicate to the compiler that the data is needed.
Since in this case the point was just to overwrite /other/ security
sensitive data present at this location, nothing else is necessary. We
don't actually care about the new content, we only pretend we do, so
that the compiler isn't able to optimize across this barrier.

Exposing compiler barriers in a portable way in D would definitely be
a good idea. Relatively decent implementations can be easily done, for
example, the above functionality can be achieved via a pure function
that takes a reference to a static array. The function would do nothing,
just immediately return to the caller; it'd just need to be opaque from
the optimizer's POV. This version wouldn't be zero-cost, like the example
above, but still very cheap (usually just a call+ret sequence), correct
and enough for many not perf-sensitive use cases like the one described
in that bug report.

artur
H. S. Teoh via Digitalmars-d
2014-07-31 16:36:01 UTC
On Thu, Jul 31, 2014 at 03:44:35PM +0200, Daniel Gibson via Digitalmars-d wrote:
[...]
Post by Daniel Gibson via Digitalmars-d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want
an optimizer to use clever inlining, use SSE etc where it makes sense
and stuff like that - but not to remove code I wrote.)
[...]

Modern compilers often have to deal with generated code (that isn't
directly written by the programmer, e.g., expanded from a C++ template
-- or, for that matter, generated by a code generator like lex / yacc).
In this case, you *do* want dead code removal because the code generator
may be written in a way that takes care of the general case, but in your
specific case some of the generated code is redundant. You don't want
to penalize specific instances of the generic code pattern, after all.


T
--
The richest man is not he who has the most, but he who needs the least.
Andrew Godfrey via Digitalmars-d
2014-07-31 17:01:37 UTC
On Thursday, 31 July 2014 at 16:37:40 UTC, H. S. Teoh via
Post by H. S. Teoh via Digitalmars-d
On Thu, Jul 31, 2014 at 03:44:35PM +0200, Daniel Gibson via
[...]
Post by Daniel Gibson via Digitalmars-d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want
an optimizer to use clever inlining, use SSE etc where it makes sense
and stuff like that - but not to remove code I wrote.)
[...]
Modern compilers often have to deal with generated code (that isn't
directly written by the programmer, e.g., expanded from a C++ template
-- or, for that matter, generated by a code generator like lex / yacc).
In this case, you *do* want dead code removal because the code generator
may be written in a way that takes care of the general case, but in your
specific case some of the generated code is redundant. You don't want
to penalize specific instances of the generic code pattern, after all.
Both points of view make sense. The problem is that it's hard for
the compiler to know whether the code it elides was generated or
explicitly written. (Maybe this is solvable in dmd, I don't
know. But it's not a feature I've seen before.)
John Colvin via Digitalmars-d
2014-07-31 17:22:45 UTC
On Thursday, 31 July 2014 at 15:26:27 UTC, Artur Skawina via
Post by Artur Skawina via Digitalmars-d
Post by Daniel Gibson via Digitalmars-d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want
an optimizer to use clever inlining, use SSE etc where it makes sense
and stuff like that - but not to remove code I wrote.)
That is actually not a bug, but a perfectly valid optimization. The
compiler isn't clairvoyant and can not know that some data that you
wrote, but never read back, matters.
The solution is to tell the compiler that you really need that newly
(over-)written data. Eg
asm {"" : : "m" (*cast(typeof(password[0])[9999999]*)password.ptr); }
(yes, stdizing compiler barriers would be a good idea)
artur
Any idea how dead store removal interacts with the modern C(++)
memory model? Another thread could hold a reference to the memory
being written to.
Artur Skawina via Digitalmars-d
2014-07-31 19:43:46 UTC
Post by Artur Skawina via Digitalmars-d
And don't forget this (rather old) case: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537
(I really don't get why anyone would want such an optimization: I want an optimizer to use clever inlining, use SSE etc where it makes sense and stuff like that - but not to remove code I wrote.)
That is actually not a bug, but a perfectly valid optimization. The
compiler isn't clairvoyant and can not know that some data that you
wrote, but never read back, matters.
Any idea how dead store removal interacts with the modern C(++) memory model? Another thread could hold a reference to the memory being written to.
In case of local/stack and TLS objects the compiler can often prove that
there are no other refs (eg because the address is never escaped).

artur
Walter Bright via Digitalmars-d
2014-07-31 07:47:54 UTC
Post by Tobias Müller via Digitalmars-d
With relatively 'dumb' compilers, this is not a big problem, but optimizers
are more and more clever and will take advantage of such assumptions if they
can.
If D wishes to be competitive, it must go down that path.

If you, as a user, do not wish this behavior, then do not use -release.

The documentation for -release says:

"compile release version, which means not generating code for contracts and
asserts. Array bounds checking is not done for system and trusted functions."

https://dlang.org/dmd-windows.html
John Colvin via Digitalmars-d
2014-07-31 08:25:25 UTC
Post by Walter Bright via Digitalmars-d
Post by Tobias Müller via Digitalmars-d
With relatively 'dumb' compilers, this is not a big problem, but optimizers
are more and more clever and will take advantage of such assumptions if they
can.
If D wishes to be competitive, it must go down that path.
If you, as a user, do not wish this behavior, then do not use -release.
The documentation for -release says:
"compile release version, which means not generating code for contracts and
asserts. Array bounds checking is not done for system and trusted functions."
https://dlang.org/dmd-windows.html
It's worth noting that ldc (and gdc maybe, can't remember) offer
some finer control over what does/doesn't get eliminated, e.g.
-enable-contracts -disable-asserts etc.
Dicebot via Digitalmars-d
2014-07-31 01:15:03 UTC
Post by Walter Bright via Digitalmars-d
I am not terribly good at writing formal legalese
specifications for this. I welcome PR's to improve the
specification along these lines, if you find any Aha! Gotcha!
issues in it. Of course, implementation errors for this in DMD
should be reported on bugzilla.
What is missing is not formal specification but clear guidelines
"how to use this system in production". Right now it is pretty
clear that you have implemented something that a non-zero number of
experienced D developers have no clue how to use without botching
the application completely. This does indicate that something is
wrong with the feature even if you are perfectly right theoretically.

Currently there is http://dlang.org/contracts.html but neither it
nor any of the referenced materials explains to me:

- how to distribute binary library packages in the presence of
contracts
- how to organize your application to ensure that contracts can
be removed in release builds
- whether those are even applicable to the majority of applications

I am less concerned with just assert behavior because there are
many ways to work around it to get different semantics. But the
contract system... no clues.
Andrei Alexandrescu via Digitalmars-d
2014-07-31 03:52:37 UTC
Post by Walter Bright via Digitalmars-d
I am not terribly good at writing formal legalese specifications for
this. I welcome PR's to improve the specification along these lines,
if you find any Aha! Gotcha! issues in it. Of course, implementation
errors for this in DMD should be reported on bugzilla.
What is missing is not formal specification but clear guidelines "how to
use this system in production". Right now it is pretty clear that you
have implemented something that a non-zero number of experienced D
developers have no clue how to use without botching the application
completely. This does indicate that something is wrong with the feature
even if you are perfectly right theoretically.
Currently there is http://dlang.org/contracts.html but neither it nor
any of the referenced materials explains to me:
- how to distribute binary library packages in the presence of contracts
- how to organize your application to ensure that contracts can be
removed in release builds
- whether those are even applicable to the majority of applications
I am less concerned with just assert behavior because there are many
ways to work around it to get different semantics. But the contract
system... no clues.
It would be awesome if you (a) documented formally the current behavior
and (b) submit bug reports wherever you find egregious faults in it. --
Andrei
Dicebot via Digitalmars-d
2014-07-31 04:01:21 UTC
On Thursday, 31 July 2014 at 03:52:55 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
It would be awesome if you (a) documented formally the current
behavior and (b) submit bug reports wherever you find egregious
faults in it. -- Andrei
It is documented at http://dlang.org/contracts.html
As for bug reports - how can I file those if I have no clue how the
system is supposed to be used? I don't think it is necessarily
bad or buggy, I just can't find a way to make good use of it. Or
do you expect a "please teach me to use contracts" bug report? :)
Andrei Alexandrescu via Digitalmars-d
2014-07-31 04:24:30 UTC
Post by Dicebot via Digitalmars-d
Post by Andrei Alexandrescu via Digitalmars-d
It would be awesome if you (a) documented formally the current
behavior and (b) submit bug reports wherever you find egregious faults
in it. -- Andrei
It is documented at http://dlang.org/contracts.html
As for bug reports - how can I file those if I have no clue how the system
is supposed to be used? I don't think it is necessarily bad or buggy,
I just can't find a way to make good use of it. Or do you expect a "please
teach me to use contracts" bug report? :)
There's this term "learned helplessness". You're a smart guy who can
slice and dice things. -- Andrei
Dicebot via Digitalmars-d
2014-07-31 05:03:44 UTC
On Thursday, 31 July 2014 at 04:24:49 UTC, Andrei Alexandrescu
Post by Andrei Alexandrescu via Digitalmars-d
There's this term "learned helplessness". You're a smart guy
who can slice and dice things. -- Andrei
This is one of those moments when I am not sure if you are
trolling me yet again or being strangely serious. I am supposed
to research and propose a brand new approach to contract
programming because I don't understand how the existing one should
work? And I am being told that by one of the authors of the existing
system?

You are taking that "go try yourself" thing way out of proportion.
Tofu Ninja via Digitalmars-d
2014-07-31 02:05:29 UTC
Post by Walter Bright via Digitalmars-d
3. Use of assert to validate input is utterly wrong and will
not be supported. Use such constructs at your own risk.
When exactly is it 'ok' to use assert then?

If asserts are not allowed to be used to verify inputs.... then
by extension, they can not be used to verify any derivative of an
input. So by that definition, asserts are only ok to use on any
thing known at compile time... that makes them utterly useless....
H. S. Teoh via Digitalmars-d
2014-07-31 17:08:49 UTC
Post by Tofu Ninja via Digitalmars-d
Post by Walter Bright via Digitalmars-d
3. Use of assert to validate input is utterly wrong and will not be
supported. Use such constructs at your own risk.
When exactly is it 'ok' to use assert then?
If asserts are not allowed to be used to verify inputs.... then by
extension, they can not be used to verify any derivative of an input.
No that doesn't make sense. The idea behind input sanitization is that
user-facing APIs receive arbitrary, unverified input from outside, and
before you use that data, you scrub it. After scrubbing it, it's
perfectly OK to use assert to verify its validity -- because if it's
still invalid, there's a bug in your scrubbing algorithm. The scrubbed
input *is* a derivative of the input, but you can't say that you can't
use assert on it!

(And if you *don't* scrub your data before using it, your design is
flawed and should be fixed.)
Post by Tofu Ninja via Digitalmars-d
So by that definition, asserts are only ok to use on any thing known
at compile time... that makes them utterly useless....
Your definition is wrong. ;-) The idea behind asserts is that you're
verifying *program logic*, not checking arbitrary input data. If you
have an algorithm that computes a square root, for example, you'd use an
assert to verify that the square of the result equals the input --
because if not, that means something has gone wrong with your square
root algorithm. But you shouldn't use an assert to verify that the input
to the square root algorithm is a valid numerical string -- because the
user could have typed "abc" instead of a number. Rather, you should
scrub the user input and throw an exception when the input is invalid.

After scrubbing, however, it's perfectly valid to assert that the input
must be a valid number -- because if not, it means the *logic* in your
scrubbing algorithm is flawed (perhaps it missed a corner case).
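
The split described above can be sketched in D roughly as follows. This is an illustration under stated assumptions, not code from the thread: `scrub` and `isqrt` are hypothetical names, and an integer square root stands in for the square root example so the check is exact.

```d
import std.algorithm : all;
import std.ascii : isDigit;
import std.conv : to;
import std.exception : enforce;
import std.string : strip;

// Boundary: text typed by a user. Bad text is an input error, so it is
// reported with enforce(), which throws an Exception.
uint scrub(string text)
{
    auto t = text.strip;
    enforce(t.length > 0 && t.length <= 9 && t.all!isDigit,
            "expected a non-negative integer of at most 9 digits");
    return t.to!uint;   // at most 9 digits always fits in a uint
}

// Internal algorithm: floor of the square root, by simple counting.
uint isqrt(uint n)
{
    ulong r = 0;
    while ((r + 1) * (r + 1) <= n)
        ++r;
    // Program-logic check: if this fails, isqrt itself is buggy,
    // no matter what the user typed.
    assert(r * r <= n && n < (r + 1) * (r + 1));
    return cast(uint) r;
}
```

A call such as `isqrt(scrub(userText))` then throws on "abc" and only trips the assert if the algorithm is wrong.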

It's true, however, that this simple idea is not always so simple in
practice. One has to draw a line between "user input" and "internal
state" somewhere, and it's not always obvious where that line falls. For
example, viewed as a whole, the entire software application may be
considered a system that takes input from outside, so "user input" means
things like keystrokes, mouse movements, etc.. But internally, the
system may consist of multiple components, and when data is passed
between components, should they be treated as "internal state" or "user
input"? Should library APIs treat application input as "external input"
and vet it before use, or is it OK to use assert to enforce(!) the
validity of data passed internally between the application's components?
It's not always easy to decide, and sometimes judgment calls have to be
made.


T
--
If creativity is stifled by rigid discipline, then it is not true creativity.
Tofu Ninja via Digitalmars-d
2014-07-31 17:32:49 UTC
On Thursday, 31 July 2014 at 17:10:27 UTC, H. S. Teoh via
On Thu, Jul 31, 2014 at 02:05:29AM +0000, Tofu Ninja via
On Wednesday, 30 July 2014 at 22:01:23 UTC, Walter Bright
Post by Walter Bright via Digitalmars-d
3. Use of assert to validate input is utterly wrong and will not be
supported. Use such constructs at your own risk.
When exactly is it 'ok' to use assert then?
If asserts are not allowed to be used to verify inputs.... then by
extension, they can not be used to verify any derivative of an input.
No that doesn't make sense. The idea behind input sanitization is that
user-facing APIs receive arbitrary, unverified input from outside, and
before you use that data, you scrub it. After scrubbing it, it's
perfectly OK to use assert to verify its validity -- because if it's
still invalid, there's a bug in your scrubbing algorithm. The scrubbed
input *is* a derivative of the input, but you can't say that you can't
use assert on it!
(And if you *don't* scrub your data before using it, your design is
flawed and should be fixed.)
So by that definition, asserts are only ok to use on any thing known
at compile time... that makes them utterly useless....
Your definition is wrong. ;-) The idea behind asserts is that you're
verifying *program logic*, not checking arbitrary input data. If you
have an algorithm that computes a square root, for example, you'd use an
assert to verify that the square of the result equals the input --
because if not, that means something has gone wrong with your square
root algorithm. But you shouldn't use an assert to verify that the input
to the square root algorithm is a valid numerical string -- because the
user could have typed "abc" instead of a number. Rather, you should
scrub the user input and throw an exception when the input is invalid.
After scrubbing, however, it's perfectly valid to assert that the input
must be a valid number -- because if not, it means the *logic* in your
scrubbing algorithm is flawed (perhaps it missed a corner case).
It's true, however, that this simple idea is not always so simple in
practice. One has to draw a line between "user input" and "internal
state" somewhere, and it's not always obvious where that line falls. For
example, viewed as a whole, the entire software application may be
considered a system that takes input from outside, so "user input" means
things like keystrokes, mouse movements, etc.. But internally, the
system may consist of multiple components, and when data is passed
between components, should they be treated as "internal state" or "user
input"? Should library APIs treat application input as "external input"
and vet it before use, or is it OK to use assert to enforce(!) the
validity of data passed internally between the application's components?
It's not always easy to decide, and sometimes judgment calls have to be
made.
T
With that logic(and the proposed optimizations that this whole
thing is about), weird stuff like this happens...

void foo(int x)
{
if(x != 0) throw ...;
assert(x == 0);
}

The if check could be removed because assert will be assumed to
always be true in release... so x could never not equal 0.... the
assert just nuked my scrubbing logic...
Daniel Murphy via Digitalmars-d
2014-07-31 18:14:13 UTC
Permalink
With that logic(and the proposed optimizations that this whole thing is
about), weird stuff like this happens...
void foo(int x)
{
if(x != 0) throw ...;
assert(x == 0);
}
The if check could be removed because assert will be assumed to always be
true in release... so x could never not equal 0.... the assert just nuked
my scrubbing logic...
The if can't be removed - and it's fairly easy to see why. In the control
flow path that contains the assert, the compiler is _already_ sure that x ==
0. The assert adds no new information.

The assumption the compiler can make is "if the program got to here, this
condition must be true". The qualification is extremely important.

The corner case is "assert(0)". It means "if the program got to here, the
impossible has happened."

So with this:

void foo(int x)
{
if(x != 0) throw ...;
assert(0);
}

the compiler doesn't have to bother checking x at all.
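To make the first case concrete, here is a small sketch with a made-up function (reciprocalScaled is purely illustrative). The throwing check is ordinary control flow, so it is what establishes the fact; the assert after it only restates what the compiler can already see on that path.

```d
import std.exception : enforce;

int reciprocalScaled(int x)
{
    // Scrubbing: ordinary code that runs with or without -release.
    enforce(x != 0, "x must be non-zero");

    // Reaching this line already implies x != 0, so treating the assert
    // as an assumption adds no information the check did not provide.
    assert(x != 0);

    return 1000 / x;
}
```

The guard cannot be justified away by the assert, because the assert only says something about executions that got past the guard.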
Artur Skawina via Digitalmars-d
2014-07-31 19:32:30 UTC
Permalink
With that logic(and the proposed optimizations that this whole thing is about), weird stuff like this happens...
void foo(int x)
{
if(x != 0) throw ...;
assert(x == 0);
}
The if check could be removed because assert will be assumed to always be true in release... so x could never not equal 0.... the assert just nuked my scrubbing logic...
The if can't be removed - and it's fairly easy to see why. In the control flow path that contains the assert, the compiler is _already_ sure that x == 0. The assert adds no new information.
As long as the assert is 100% correct. If you have a hundred+ asserts
and a 1% error rate...
A wrong assert could (under the proposed model) propagate the wrong
assumptions both ways. Silently disabling other checks that would
have otherwise caught the error.

Imagine creating a hotfix for some newly discovered bug, and forgetting
to update an assert expression somewhere. Unless the problem is
triggered while testing a non-release build, you may end up shipping a
broken product, even one with bugs that were not present in the original.
Now imagine that somebody else will handle the next report. He/she will
look at the code, see absolutely no problems with it, all necessary
checks will be there... Figuring out that a) an assert is the cause, and
b) which one it is, will be a very interesting process...
The corner case is "assert(0)". It means "if the program got to here, the impossible has happened."
It's vaguely defined, overloaded, and not currently treated that way.
Arguably it could mean 'this path won't ever be reached, trust me', but
redefining it now is obviously not possible. (doing this would of course
make assert(0) extremely dangerous)

artur
Andrew Godfrey via Digitalmars-d
2014-07-31 05:15:51 UTC
Permalink
Post by Walter Bright via Digitalmars-d
2. The compiler can make use of assert expressions to improve
optimization, even in -release mode.
For the domain I'm currently working in - a
very large codebase (> 1 MLOC, C/C++) for an application program,
I have to echo what others said, and say I could not use such a
feature.
I think I can add a reason (though what's been said about the
'fuzzy middle'
between assertions and input validation, certainly rings true for
me too).

If my asserts worked this way I would have to stop using them and
build my own.
The reason is that, while I tend to assert only things that
should be true,
this codebase is not well factored and so:
a) we tend to write a lot of assertions, and
b) occasionally we learn something from them (i.e. an assertion
fires,
we go "huh", and our understanding of the codebase improves).

The point is that a priori, we can only guess whether a
particular assertion
we're considering adding is really "this program is screwed if
this condition
is true".

I don't lose sleep over this because it is safe to add our kind
of assertions.
But if adding assertions could affect the optimizer's reasoning,
then it would NOT be safe to add them, and we'd have to back way
off. I'd be comfortable using such assertions only for very
low-level components.

I can see the appeal of allowing the optimizer to do this, but I
don't understand the idea of making that the default behavior. To
me that's like array bounds-checking being off by default. And
speaking of which,
this seems like a useful example:

Surely any program which oversteps the bounds of an array is incorrect?
It must have made some logic error (be it forgetting to validate inputs,
or some internal reasoning that was erroneous). So we should put asserts
on all our array accesses, asserting that they are within bounds!
So... then the optimizer can optimize away all the bounds checks. Release
builds need no checks of any kind. Right? :)
I'm not trying to be as facetious as that sounds, I'm saying that
your position seems to me to lead logically to the conclusion
that array bounds-checking
should be off in release.
Walter Bright via Digitalmars-d
2014-07-31 06:57:18 UTC
Permalink
Post by Joseph Rushton Wakeling via Digitalmars-d
My take is that, for this reason, these should be asserts and not enforce()
statements. What are your thoughts on the matter?
An excellent question.

First, note that enforce() returns a recoverable exception, and assert() a
non-recoverable error.

Logically, this means that enforce() is for scrubbing input, and assert() is for
detecting program bugs. I'm pretty brutal in asserting (!) that program bugs are
fatal, non-recoverable errors. I've been fighting this battle for decades, i.e.
repeated proposals by well-meaning programmers who believe their programs can
safely recover from an unknown, invalid state.

Pragmatically, enforce() is significantly more expensive as it uses the GC and
callers must support exception safety. Also, take a look at the quantity
generated code for the enforce() in the examples, and see how expensive it is.
All those lovely messages generate a lot of bloat.

Phobos functions that are designed to scrub input must document so. Otherwise,
they should assert.

For LinearCongruentialEngine() and initialize(), passing invalid arguments is a
programming bug, and so they should be asserting.
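As a rough sketch of the split described above (parsePort and portOrDefault are hypothetical names used only for illustration): a failed enforce() surfaces as an Exception the caller can catch and recover from, while a failed assert surfaces as an AssertError, which is an Error and is not caught by catch (Exception).

```d
import core.exception : AssertError;
import std.exception : enforce;

/// Input check: an out-of-range port is an input error.
ushort parsePort(int raw)
{
    enforce(raw > 0 && raw <= ushort.max, "port out of range");
    return cast(ushort) raw;
}

/// Recovers from rejected input by falling back to a default.
ushort portOrDefault(int raw, ushort fallback)
{
    try
    {
        return parsePort(raw);
    }
    catch (Exception e) // recoverable: the program state is still valid
    {
        return fallback;
    }
}

// AssertError sits under Error, not Exception, in the Throwable hierarchy.
static assert(is(AssertError : Error));
static assert(!is(AssertError : Exception));
```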
Kagamin via Digitalmars-d
2014-07-31 13:10:57 UTC
Permalink
Post by Walter Bright via Digitalmars-d
For LinearCongruentialEngine() and initialize(), passing
invalid arguments is a programming bug, and so they should be
asserting.
Isn't phobos compiled in release mode? And since those asserts
are never compiled, what purpose do they serve?
Jonathan M Davis via Digitalmars-d
2014-07-31 19:31:49 UTC
Permalink
Post by Kagamin via Digitalmars-d
Post by Walter Bright via Digitalmars-d
For LinearCongruentialEngine() and initialize(), passing
invalid arguments is a programming bug, and so they should be
asserting.
Isn't phobos compiled in release mode? And since those asserts
are never compiled, what purpose do they serve?
The whole type is templated, so the assertions will be compiled
in based on whether the user's code is compiled with -release or
not. But it is true that in any non-templated Phobos code,
assertions are only good for checking Phobos during its unit
tests if Phobos is compiled in release mode. IIRC though, there's
a -defaultlib switch (or something close to that) which allows
you to tell the compiler to use a different version of Phobos, so
you can tell it to use a non-release build of Phobos, but I don't
remember if we provide such a build with the released compiler or
not. Regardless, most code probably won't use it, so while it's a
nice idea and useful for those who put in the effort to use it,
it's definitely true that most code won't benefit from it,
seriously reducing the value of any assertions in non-templated
Phobos code which are intended to check user code rather than
Phobos itself.

- Jonathan M Davis
Sean Kelly via Digitalmars-d
2014-07-31 21:21:55 UTC
Permalink
Post by Walter Bright via Digitalmars-d
Post by Joseph Rushton Wakeling via Digitalmars-d
My take is that, for this reason, these should be asserts and
not enforce() statements. What are your thoughts on the
matter?
An excellent question.
First, note that enforce() returns a recoverable exception, and
assert() a non-recoverable error.
Logically, this means that enforce() is for scrubbing input,
and assert() is for detecting program bugs. I'm pretty brutal
in asserting (!) that program bugs are fatal, non-recoverable
errors. I've been fighting this battle for decades, i.e.
repeated proposals by well-meaning programmers who believe
their programs can safely recover from an unknown, invalid
state.
It may be worth drawing a distinction between serial programs
(ie. a typical desktop app) and parallel programs (ie. server
code). For example, in a server program, a logic error caused by
one particular type of request often doesn't invalidate the state
of the entire process. More often, it just prevents further
processing of that one request. Much like how an error in one
thread of a multithreaded program with no shared data does not
corrupt the state of the entire system.

In short, what one terms a "process" is really somewhat fluid.
What is the distinction between a multi-threaded application
without sharing and multiple instances of the same process all
running individually? In Erlang, a "process" is really just a
thread being run by the VM, and each process is effectively a
class instance. All errors are fatal, but they're only fatal for
that one logical process. It doesn't take down the whole VM.

Now in a language like D that allows direct memory access, one
could argue that any logic error may theoretically corrupt the
entire program, and I presume this is where you stand. But more
often in my experience, some artifact of the input data or
environmental factors end up pushing execution through a path
that wasn't adequately tested, and results in a fully recoverable
error (initiated by bad program logic) so long as the error is
detected in a timely manner. These are issues I want caught and
signaled in the most visible manner possible so the logic error
can be fixed, but I don't always want to immediately halt the
entire process and terminate potentially thousands of entirely
correct in-progress transactions.

Perhaps these are all issues that should be marked by enforce
throwing a ProgramLogicException rather than assert with an
AssertError, but at that point it's almost bikeshedding.
Thoughts?
Daniel Gibson via Digitalmars-d
2014-07-31 21:31:26 UTC
Permalink
Post by Sean Kelly via Digitalmars-d
Post by Walter Bright via Digitalmars-d
Post by Joseph Rushton Wakeling via Digitalmars-d
My take is that, for this reason, these should be asserts and not
enforce() statements. What are your thoughts on the matter?
An excellent question.
First, note that enforce() returns a recoverable exception, and
assert() a non-recoverable error.
Logically, this means that enforce() is for scrubbing input, and
assert() is for detecting program bugs. I'm pretty brutal in asserting
(!) that program bugs are fatal, non-recoverable errors. I've been
fighting this battle for decades, i.e. repeated proposals by
well-meaning programmers who believe their programs can safely recover
from an unknown, invalid state.
It may be worth drawing a distinction between serial programs
(ie. a typical desktop app) and parallel programs (ie. server
code). For example, in a server program, a logic error caused by
one particular type of request often doesn't invalidate the state
of the entire process. More often, it just prevents further
processing of that one request. Much like how an error in one
thread of a multithreaded program with no shared data does not
corrupt the state of the entire system.
In short, what one terms a "process" is really somewhat fluid.
What is the distinction between a multi-threaded application
without sharing and multiple instances of the same process all
running individually? In Erlang, a "process" is really just a
thread being run by the VM, and each process is effectively a
class instance. All errors are fatal, but they're only fatal for
that one logical process. It doesn't take down the whole VM.
Now in a language like D that allows direct memory access, one
could argue that any logic error may theoretically corrupt the
entire program, and I presume this is where you stand. But more
often in my experience, some artifact of the input data or
environmental factors end up pushing execution through a path
that wasn't adequately tested, and results in a fully recoverable
error (initiated by bad program logic) so long as the error is
detected in a timely manner. These are issues I want caught and
signaled in the most visible manner possible so the logic error
can be fixed, but I don't always want to immediately halt the
entire process and terminate potentially thousands of entirely
correct in-progress transactions.
Perhaps these are all issues that should be marked by enforce
throwing a ProgramLogicException rather than assert with an
AssertError, but at that point it's almost bikeshedding.
Thoughts?
I agree.
Also: During development I'd be fine with this terminating the program
(ideally dumping the core) so the error is immediately noticed, but in
release mode I'm not, so a construct that halts in debug mode and throws
an exception or maybe just returns false (or one construct for each of
the cases) in release mode would be helpful.
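One way such a construct might look, as a sketch only: verify is a hypothetical helper, not part of the language or Phobos, and it relies on the predefined version (assert) identifier, which is set when assert checks are being emitted.

```d
/// Hypothetical helper: fails hard in checked builds, throws a catchable
/// Exception when assert checks are compiled out (e.g. with -release).
void verify(bool condition, string message)
{
    version (assert)
    {
        // Checked build: AssertError, so the bug is noticed immediately.
        assert(condition, message);
    }
    else
    {
        // Unchecked build: signal the failure without killing the process.
        if (!condition)
            throw new Exception(message);
    }
}
```

Whether continuing after such an exception is safe is exactly what is being debated in this thread; the sketch only shows that the switch between the two behaviours can be written in library code.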

Cheers,
Daniel
simendsjo via Digitalmars-d
2014-07-31 07:01:37 UTC
Permalink
On 07/31/2014 12:01 AM, Walter Bright wrote:
(...)
Post by Walter Bright via Digitalmars-d
2. The compiler can make use of assert expressions to improve
optimization, even in -release mode.
(...)

Does this mean that assertions used for optimization will be left in
-release? There's plenty of times where I've had an old incorrect
assertion in my code after a refactoring - sometimes in seldom used paths.

If the compiler would aggressively optimize my code based on some wrong
assumptions I give it, it would be very useful if that assumption would
stay in the code to trigger an assertion.
Walter Bright via Digitalmars-d
2014-07-31 07:36:46 UTC
Permalink
Post by simendsjo via Digitalmars-d
(...)
Post by Walter Bright via Digitalmars-d
2. The compiler can make use of assert expressions to improve
optimization, even in -release mode.
(...)
Does this mean that assertions used for optimization will be left in
-release? There's plenty of times where I've had an old incorrect
assertion in my code after a refactoring - sometimes in seldom used paths.
If the compiler would aggressively optimize my code based on some wrong
assumptions I give it, it would be very useful if that assumption would
stay in the code to trigger an assertion.
To get this behavior, don't use the -release switch.
David Bregman via Digitalmars-d
2014-07-31 07:40:13 UTC
Permalink
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
1. I can discern no useful, practical difference between the
notions of assume and assert.
People have explained the difference repeatedly, a ridiculous
number of times now. Could you please take a minute to understand
it this time instead of flippantly dismissing it again?

assert does a runtime check, assume does not
assume affects code generation/optimization, assert does not
assert is for debugging, assume is not
assume is for optimization, assert is not

In terms of what they practically do, they have *nothing* in
common, their functions are entirely orthogonal.

Still think there is no practical difference?
Post by Walter Bright via Digitalmars-d
2. The compiler can make use of assert expressions to improve
optimization, even in -release mode.
This will introduce a lot of undefined behavior, including making
@safe code with asserts unsafe. I really think this needs to be
acknowledged. As far as I can tell from the other thread, it
still hasn't been.
Walter Bright via Digitalmars-d
2014-07-31 08:08:47 UTC
Permalink
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
1. I can discern no useful, practical difference between the notions of assume
and assert.
People have explained the difference repeatedly, a ridiculous number of times
now. Could you please take a minute to understand it this time instead of
flippantly dismissing it again?
assert does a runtime check, assume does not
assume affects code generation/optimization, assert does not
assert is for debugging, assume is not
assume is for optimization, assert is not
In terms of what they practically do, they have *nothing* in common, their
functions are entirely orthogonal.
They are inextricably entangled. Consider:

if (x == 0) abort(); // essentially what assert(x) does
... at this point, the optimizer knows, beyond doubt, that x!=0 ...
if (x) // optimizer can remove this check
...

which has the behavior of assume as you listed above, yet it is assert. We can
pretend assert doesn't affect code, like we can pretend to have massless points
in physics class, but in reality points have a mass and assert most definitely
affects code generation, and does in every compiler I've checked, and it affects
it in just the way the assume does.
Still think there is no practical difference?
Yes.
Post by Walter Bright via Digitalmars-d
2. The compiler can make use of assert expressions to improve optimization,
even in -release mode.
This will introduce a lot of undefined behavior, including making @safe code
with asserts unsafe. I really think this needs to be acknowledged. As far as I
can tell from the other thread, it still hasn't been.
I did acknowledge it for the array bounds case.

Note that your assume() will have the same effect, and worse, there will be no
option to have the compiler insert a check, because then it would be an assert()
and you might as well just use assert().

This is why I see the distinction as being pointless.
Tobias Pankrath via Digitalmars-d
2014-07-31 08:27:09 UTC
Permalink
Post by Walter Bright via Digitalmars-d
Post by David Bregman via Digitalmars-d
In terms of what they practically do, they have *nothing* in
common, their
functions are entirely orthogonal.
if (x == 0) abort(); // essentially what assert(x) does
... at this point, the optimizer knows, beyond doubt, that
x!=0 ...
if (x) // optimizer can remove this check
...
As far as I understand, this would be the behaviour without
-release. With -release the code becomes

if(x)
...

and the optimizer cannot remove the (second) check. Or am I
missing something?
Walter Bright via Digitalmars-d
2014-07-31 09:11:32 UTC
Permalink
Post by Walter Bright via Digitalmars-d
In terms of what they practically do, they have *nothing* in common, their
functions are entirely orthogonal.
if (x == 0) abort(); // essentially what assert(x) does
... at this point, the optimizer knows, beyond doubt, that x!=0 ...
if (x) // optimizer can remove this check
...
As far as I understand, this would be the behaviour without -release. With
-release the code becomes
if(x)
...
and the optimizer cannot remove the (second) check. Or am I missing something?
My intention is that the runtime check would be omitted, but the information
would still be fed to the optimizer. This is not currently implemented.
David Bregman via Digitalmars-d
2014-07-31 11:28:40 UTC
Permalink
Post by Walter Bright via Digitalmars-d
On Wednesday, 30 July 2014 at 22:01:23 UTC, Walter Bright
Post by Walter Bright via Digitalmars-d
I'd like to sum up my position and intent on all this.
1. I can discern no useful, practical difference between the
notions of assume
and assert.
People have explained the difference repeatedly, a ridiculous
number of times
now. Could you please take a minute to understand it this time instead of
flippantly dismissing it again?
assert does a runtime check, assume does not
assume affects code generation/optimization, assert does not
assert is for debugging, assume is not
assume is for optimization, assert is not
In terms of what they practically do, they have *nothing* in
common, their
functions are entirely orthogonal.
if (x == 0) abort(); // essentially what assert(x) does
... at this point, the optimizer knows, beyond doubt, that
x!=0 ...
if (x) // optimizer can remove this check
...
which has the behavior of assume as you listed above, yet it is
assert. We can pretend assert doesn't affect code, like we can
pretend to have massless points in physics class, but in
reality points have a mass and assert most definitely affects
code generation, and does in every compiler I've checked, and
it affects it in just the way the assume does.
Still think there is no practical difference?
Yes.
Sigh. Of course you can assume the condition after a runtime
check has been inserted. You just showed that

assert(x); assume(x);

is semantically equivalent to
assert(x);

as long as the runtime check is not elided. (no -release)

You didn't show that assert and assume are the same, they are not.

The code generated by one will be different than the code
generated by the other, that is because they are functionally
different. This is really indisputable..
Post by Walter Bright via Digitalmars-d
Post by Walter Bright via Digitalmars-d
2. The compiler can make use of assert expressions to improve optimization,
even in -release mode.
This will introduce a lot of undefined behavior, including
making @safe code with asserts unsafe. I really think this needs to be
acknowledged. As far as I
can tell from the other thread, it still hasn't been.
I did acknowledge it for the array bounds case.
Ok, thanks! But you still want to assert to become assume in
release mode? How will you handle the safety issue?
Post by Walter Bright via Digitalmars-d
Note that your assume() will have the same effect, and worse,
there will be no option to have the compiler insert a check,
because then it would be an assert() and you might as well just
use assert().
So what? I did not suggest to use assume() instead of assert() to
avoid the problem. In fact, that _is_ the problem, _you_ are
suggesting that assert becomes assume in release mode. assume()
is not @safe, that is the whole point.
Walter Bright via Digitalmars-d
2014-07-31 18:58:06 UTC
Permalink
Sigh. Of course you can assume the condition after a runtime check has been
inserted. You just showed that
assert(x); assume(x);
is semantically equivalent to
assert(x);
as long as the runtime check is not elided. (no -release)
No. I showed that you cannot have an assert without the assume. That makes them
equivalent that direction.

For the other direction, adding in a runtime check for an assume is going to be
expected of an implementation. And, in fact, since the runtime check won't
change the semantics if the assume is correct, they are equivalent.

I.e. for practical purposes, they are the same thing. You can't have one without
the other.
The code generated by one will be different than the code generated by the
other, that is because they are functionally different. This is really
indisputable..
Oh, I dispute it very much!
But you still want to assert to become assume in release mode? How
will you handle the safety issue?
I don't know yet.
So what?
It came up in the thread about assume vs assert. I assumed (!) it mattered to you.
Timon Gehr via Digitalmars-d
2014-07-31 19:53:42 UTC
Permalink
Post by Walter Bright via Digitalmars-d
Sigh. Of course you can assume the condition after a runtime check has been
inserted. You just showed that
assert(x); assume(x);
is semantically equivalent to
assert(x);
as long as the runtime check is not elided. (no -release)
No. I showed that you cannot have an assert without the assume.
No you did not. However:

* You showed that an additional 'assume' would not have any effect if
the check is never elided.

* You showed that the optimizer's state of knowledge about the program
state is the same after processing a halting runtime check and
after processing an 'assume'.

I don't think anybody is contesting that. Now try to zoom your focus out
a little, and think about _what if_ the assertion and the assumption are
actually wrong? Why does it make sense to conflate them in this case?
Post by Walter Bright via Digitalmars-d
That makes them equivalent that direction.
For the other direction, adding in a runtime check for an assume is
going to be expected of an implementation.
Yes if 'assert' does what 'assert' does now, and if 'assume' does what
'assert' does now, then 'assert' and 'assume' do the same. I agree with
that, but the premise is unrelated to this discussion. You are moving
the goal posts.
Post by Walter Bright via Digitalmars-d
And, in fact, since the
runtime check won't change the semantics if the assume is correct, they
are equivalent.
...
"If the 'assume'/'assert' are correct" is not a sound assumption to
make. You are not the compiler, you are the programmer. We are
discussing _about_ programs, not _within_ programs.
Post by Walter Bright via Digitalmars-d
I.e. for practical purposes, they are the same thing.
All assertions being correct is not a given 'for practical purposes'.
You are arguing in the context of a theoretical ideal and this context
alone.
David Bregman via Digitalmars-d
2014-07-31 22:07:56 UTC
Permalink
Post by Walter Bright via Digitalmars-d
Post by David Bregman via Digitalmars-d
Sigh. Of course you can assume the condition after a runtime
check has been
inserted. You just showed that
assert(x); assume(x);
is semantically equivalent to
assert(x);
as long as the runtime check is not elided. (no -release)
No. I showed that you cannot have an assert without the assume.
That makes them equivalent that direction.
That is only true if assert always generates a runtime check.
i.e. it is not true for C/C++ assert (and so far, D assert) in
release mode.
Post by Walter Bright via Digitalmars-d
For the other direction, adding in a runtime check for an
assume is going to be expected of an implementation.
No. It is expected that assume does /not/ have a runtime check.
Assume is used to help the compiler optimize based on trusted
facts, doing a runtime check could easily defeat the purpose of
such micro optimizations.
Post by Walter Bright via Digitalmars-d
And, in fact, since the runtime check won't change the
semantics if the assume is correct, they are equivalent.
Right, only "if the assume is correct". So they aren't equivalent
if it isn't correct.

Q.E.D. ?
Post by Walter Bright via Digitalmars-d
Post by David Bregman via Digitalmars-d
But you still want to assert to become assume in release mode? How
will you handle the safety issue?
I don't know yet.
I would think the easiest way is to just not inject the
assumption when inside @safe code, but I don't know anything
about the compiler internals.

Even for @system code, I'm on the fence about whether asserts
should affect codegen in release, it doesn't seem like a clear
tradeoff to make: safety vs some dubious optimization gains. Do
we really want to go down the same road as C with undefined
behavior?

I would need to think about it more, but if D adopted that route,
I would at least feel like I need to be much more careful with
asserts, so I'm not accidentally making my code more buggy
instead of less. I think it warrants discussion, anyways.
Walter Bright via Digitalmars-d
2014-08-01 01:19:59 UTC
Permalink
Post by Walter Bright via Digitalmars-d
Sigh. Of course you can assume the condition after a runtime check has been
inserted. You just showed that
assert(x); assume(x);
is semantically equivalent to
assert(x);
as long as the runtime check is not elided. (no -release)
No. I showed that you cannot have an assert without the assume. That makes
them equivalent that direction.
That is only true if assert always generates a runtime check. i.e. it is not
true for C/C++ assert (and so far, D assert) in release mode.
Post by Walter Bright via Digitalmars-d
For the other direction, adding in a runtime check for an assume is going to
be expected of an implementation.
No. It is expected that assume does /not/ have a runtime check. Assume is used
to help the compiler optimize based on trusted facts, doing a runtime check
could easily defeat the purpose of such micro optimizations.
I'm rather astonished you'd take that position. It opens a huge door wide for
undefined behavior, and no obvious way of verifying that the assume() is correct.

I'm confident that if D introduced such behavior, the very first comment would
be "I need it to insert a runtime check on demand."
Post by Walter Bright via Digitalmars-d
And, in fact, since the runtime check won't change the semantics if the assume
is correct, they are equivalent.
Right, only "if the assume is correct". So they aren't equivalent if it isn't
correct.
Q.E.D. ?
I'm not buying those uncheckable semantics as being workable and practical.
Post by Walter Bright via Digitalmars-d
But you still want to assert to become assume in release mode? How
will you handle the safety issue?
I don't know yet.
I would think the easiest way is to just not inject the assumption when inside
@safe code, but I don't know anything about the compiler internals.
Even for @system code, I'm on the fence about whether asserts should affect
codegen in release, it doesn't seem like a clear tradeoff to make: safety vs
some dubious optimization gains.
So why do you want assume() with no checking whatsoever? Does anybody want that?
Why are we even discussing such a misfeature?
Do we really want to go down the same road as C
with undefined behavior?
So you don't want assume()? Who does?
Daniel Murphy via Digitalmars-d
2014-07-31 08:23:44 UTC
Permalink
Post by Walter Bright via Digitalmars-d
5. assert(0); is equivalent to a halt, and the compiler won't remove it.
This is not the same definition the spec gives. The spec says assert(0) can
be treated as unreachable, and the compiler is allowed to optimize
accordingly.

The difference is that in this code:

if (cond)
assert(0);

With your above definition cond will be evaluated, while with the spec's
more powerful definition it may be skipped.
Walter Bright via Digitalmars-d
2014-07-31 09:13:50 UTC
Permalink
Post by Walter Bright via Digitalmars-d
5. assert(0); is equivalent to a halt, and the compiler won't remove it.
This is not the same definition the spec gives. The spec says assert(0) can be
treated as unreachable, and the compiler is allowed to optimize accordingly.
It says more than that:

"The expression assert(0) is a special case; it signifies that it is unreachable
code. Either AssertError is thrown at runtime if it is reachable, or the
execution is halted (on the x86 processor, a HLT instruction can be used to halt
execution). The optimization and code generation phases of compilation may
assume that it is unreachable code."

-- http://dlang.org/expression.html#AssertExpression
ponce via Digitalmars-d
2014-07-31 10:24:06 UTC
Permalink
Post by Walter Bright via Digitalmars-d
"Walter Bright" wrote in message
Post by Walter Bright via Digitalmars-d
5. assert(0); is equivalent to a halt, and the compiler won't remove it.
This is not the same definition the spec gives. The spec says assert(0) can be
treated as unreachable, and the compiler is allowed to
optimize accordingly.
"The expression assert(0) is a special case; it signifies that
it is unreachable code. Either AssertError is thrown at runtime
if it is reachable, or the execution is halted (on the x86
processor, a HLT instruction can be used to halt execution).
The optimization and code generation phases of compilation may
assume that it is unreachable code."
-- http://dlang.org/expression.html#AssertExpression
You said "the compiler won't remove it".
http://dlang.org/expression.html#AssertExpression says: "The
optimization and code generation phases of compilation may assume
that it is unreachable code."

Who is right?

If I write:

---
switch(expr())
{
case 0: doIt();
case 1: doThat();
default:
assert(0);
}
---

Will the optimizer be able to remove the default: case?
Because if I use assert(0), it's on purpose and I do not want it to
be elided, ever.
MSVC has __assume(0); for unreachable code, GCC has
__builtin_unreachable()
via Digitalmars-d
2014-07-31 11:01:54 UTC
Post by ponce via Digitalmars-d
---
switch(expr())
{
case 0: doIt();
case 1: doThat();
default:
assert(0);
}
---
Will the optimizer be able to remove the default: case?
Assuming fall-through (`goto case`), not only the default case.
The entire switch could be removed, under the condition that the
compiler can prove that neither `expr()`, `doIt()`, nor
`doThat()` throws, even if they have side effects. And maybe even
the entire function, and all functions that call it, depending on
how exactly the control flow is.
ponce via Digitalmars-d
2014-07-31 13:49:49 UTC
Post by via Digitalmars-d
Post by ponce via Digitalmars-d
---
switch(expr())
{
case 0: doIt();
case 1: doThat();
default:
assert(0);
}
---
Will the optimizer be able to remove the default: case?
Assuming fall-through (`goto case`), not only the default case.
The entire switch could be removed, under the condition that
the compiler can prove that neither `expr()`, `doIt()`, nor
`doThat()` throws, even if they have side effects. And maybe
even the entire function, and all functions that call it,
depending on how exactly the control flow is.
Ok my example was wrong, I meant:

---
switch(expr())
{
case 0: doIt(); break;
case 1: doThat(); break;
default:
assert(0);
break;
}
---
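For reference, a self-contained version of the corrected switch might look like the sketch below. The selector is passed in as a parameter instead of calling `expr()`, and `doIt`/`doThat` are hypothetical stand-ins that return strings so the result can be observed; none of these definitions come from the original post.

```d
// Hypothetical stand-ins for the doIt()/doThat() calls in the post.
string doIt()   { return "it"; }
string doThat() { return "that"; }

string dispatch(int selector)
{
    switch (selector)
    {
        case 0: return doIt();
        case 1: return doThat();
        default:
            // Not expected to be reached. Per the quoted spec text, this
            // either throws AssertError or halts execution.
            assert(0, "unexpected selector");
    }
}
```

Whether an optimizer may drop the default branch is exactly what the rest of the thread argues about; the sketch only shows the construct under discussion.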
Walter Bright via Digitalmars-d
2014-07-31 19:03:08 UTC
Post by ponce via Digitalmars-d
Post by Walter Bright via Digitalmars-d
Post by Walter Bright via Digitalmars-d
5. assert(0); is equivalent to a halt, and the compiler won't remove it.
This is not the same definition the spec gives. The spec says assert(0) can be
treated as unreachable, and the compiler is allowed to optimize accordingly.
"The expression assert(0) is a special case; it signifies that it is
unreachable code. Either AssertError is thrown at runtime if it is reachable,
or the execution is halted (on the x86 processor, a HLT instruction can be
used to halt execution). The optimization and code generation phases of
compilation may assume that it is unreachable code."
-- http://dlang.org/expression.html#AssertExpression
You said "the compiler won't remove it".
Right, and it doesn't.
Post by ponce via Digitalmars-d
http://dlang.org/expression.html#AssertExpression says: "The optimization and
code generation phases of compilation may assume that it is unreachable code."
Who is right?
It means if the control flow does actually get there, a HALT is executed.
Timon Gehr via Digitalmars-d
2014-07-31 19:11:54 UTC
Post by Walter Bright via Digitalmars-d
Post by ponce via Digitalmars-d
Post by Walter Bright via Digitalmars-d
Post by Walter Bright via Digitalmars-d
5. assert(0); is equivalent to a halt, and the compiler won't remove it.
This is not the same definition the spec gives. The spec says assert(0) can be
treated as unreachable, and the compiler is allowed to optimize accordingly.
"The expression assert(0) is a special case; it signifies that it is
unreachable code. Either AssertError is thrown at runtime if it is reachable,
or the execution is halted (on the x86 processor, a HLT instruction can be
used to halt execution). The optimization and code generation phases of
compilation may assume that it is unreachable code."
-- http://dlang.org/expression.html#AssertExpression
You said "the compiler won't remove it".
Right, and it doesn't.
Post by ponce via Digitalmars-d
http://dlang.org/expression.html#AssertExpression says: "The optimization and
code generation phases of compilation may assume that it is unreachable code."
Who is right?
It means if the control flow does actually get there, a HALT is executed.
And assuming control flow does not actually get there?
Jonathan M Davis via Digitalmars-d
2014-07-31 19:54:46 UTC
Post by Timon Gehr via Digitalmars-d
Post by Walter Bright via Digitalmars-d
It means if the control flow does actually get there, a HALT
is executed.
And assuming control flow does not actually get there?
Then the HALT instruction is never hit. The compiler would have
to be able to prove that reaching the HALT instruction was
impossible in order to remove it (in which case, I would assume
that it would remove it, but I wouldn't expect that to happen
very often).

- Jonathan M Davis
David Nadlinger via Digitalmars-d
2014-07-31 20:36:56 UTC
Post by Jonathan M Davis via Digitalmars-d
Then the HALT instruction is never hit. The compiler would have
to be able to prove that reaching the HALT instruction was
impossible in order to remove it [...]
Well, no, as the portion of the spec Walter quoted specifically
states that the compiler can *assume* the assert(0) to be
unreachable. If you can assume something, you don't have to prove
it any longer for any sane definition of "assume".

If this is not something we want, we need to fix the spec instead
of trying to argue around the problem.

David
Timon Gehr via Digitalmars-d
2014-07-31 21:09:22 UTC
Post by Jonathan M Davis via Digitalmars-d
Post by Timon Gehr via Digitalmars-d
Post by Walter Bright via Digitalmars-d
It means if the control flow does actually get there, a HALT is executed.
And assuming control flow does not actually get there?
Then the HALT instruction is never hit.
Indeed. Now note that the compiler is arguing from the same standpoint
that you were just now.
Post by Jonathan M Davis via Digitalmars-d
The compiler would have to be
able to prove that reaching the HALT instruction was impossible in order
to remove it
The compiler is able to prove this immediately. The goal it needs to
prove ('control flow does not reach assert(0)') is an assumption made by
Walter and the language specification and readily available.

Note that I'd be happier with the state of affairs if you were right
about this.
ponce via Digitalmars-d
2014-07-31 19:57:20 UTC
Post by Walter Bright via Digitalmars-d
It means if the control flow does actually get there, a HALT is executed.
Fine.
David Nadlinger via Digitalmars-d
2014-07-31 20:33:49 UTC
Post by Walter Bright via Digitalmars-d
Post by ponce via Digitalmars-d
Post by Walter Bright via Digitalmars-d
"The expression assert(0) is a special case; it signifies that it is
unreachable code. Either AssertError is thrown at runtime if it is reachable,
or the execution is halted (on the x86 processor, a HLT instruction can be
used to halt execution). The optimization and code generation phases of
compilation may assume that it is unreachable code."
-- http://dlang.org/expression.html#AssertExpression
You said "the compiler won't remove it".
Right, and it doesn't.
This is in direct contradiction to the quoted spec excerpt. If
the backend can assume that something is unreachable code, why on
earth should it need to actually emit that code? A small example:

---
void foo(int a) {
if (a == 42) assert(0);
// Do something else.
}
---

If the compiler is free to assume that the assert is unreachable,
please explain to me what stops it from inferring that the branch
is never taken and transforming the example to the equivalent of:

---
void foo(int a) {
// Do something else.
}
---

LDC would do this today if we implemented the part of the spec about
assuming unreachability (we currently emit a halt – actually a ud2 trap
on x86 – instead).

I've had the questionable pleasure of tracking down a couple of
related issues in LLVM and the LDC codegen, so please take my
word for it: Requiring any particular behavior such as halting in
a case that can be assumed to be unreachable is at odds with how
the term "unreachable" is used in the wild – at least in projects
like GCC and LLVM.

Cheers,
David
Walter Bright via Digitalmars-d
2014-07-31 21:25:19 UTC
Post by David Nadlinger via Digitalmars-d
I've had the questionable pleasure of tracking down a couple of related issues
in LLVM and the LDC codegen, so please take my word for it: Requiring any
particular behavior such as halting in a case that can be assumed to be
unreachable is at odds with how the term "unreachable" is used in the wild – at
least in projects like GCC and LLVM.
For example:

int foo() {
while (...) {
...
}
assert(0);
}

the compiler needn't issue an error at the end "no return value for foo()"
because it can assume it never got there.

I'll rewrite that bit in the spec as it is clearly causing confusion.
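A concrete instance of the pattern above might look like the following sketch. The function name and the assumption that the caller always supplies at least one negative element are hypothetical, chosen only to give the loop a body.

```d
// Returns the first negative element. The caller is assumed (hypothetically)
// to guarantee that at least one exists.
int firstNegative(const int[] values)
{
    foreach (v; values)
    {
        if (v < 0)
            return v;
    }
    // No return statement is needed here: assert(0) ends the function,
    // so the compiler does not report a missing return value.
    assert(0, "no negative element found");
}
```

If the caller breaks that guarantee, control reaches the assert(0) and the program stops there instead of returning garbage.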
Timon Gehr via Digitalmars-d
2014-07-31 22:04:50 UTC
Post by Walter Bright via Digitalmars-d
I'll rewrite that bit in the spec as it is clearly causing confusion.
A wording that avoids all those issues would be something like:
"'assert(0)' never returns and hence terminates the basic block it
occurs in."
David Nadlinger via Digitalmars-d
2014-07-31 22:17:55 UTC
Post by Walter Bright via Digitalmars-d
Post by David Nadlinger via Digitalmars-d
I've had the questionable pleasure of tracking down a couple
of related issues
Requiring any
particular behavior such as halting in a case that can be
assumed to be
unreachable is at odds with how the term "unreachable" is used in the wild – at
least in projects like GCC and LLVM.
int foo() {
while (...) {
...
}
assert(0);
}
the compiler needn't issue an error at the end "no return value
for foo()" because it can assume it never got there.
I'll rewrite that bit in the spec as it is clearly causing
confusion.
Don't rewrite it because you merely concede that it might be
confusing. Rewrite it because you admit it's contradictory. If
you just try to reword the spec without understanding how your
use of the terminology differs from the established meaning,
you'll probably come up with something that is confusing to the
rest of the world just as well.

Perhaps looking at the situation in terms of basic blocks and the
associated control flow graph will help:

As per your above post, assert(0) has nothing to do with making
any assumptions on the compiler side. It merely servers as a
terminator instruction of a BB, making it a leaf in the CFG. This
seems to be the definition you intend for the spec. Maybe add
something along the lines of "behaves like a function call that
never returns" as an explanation to make it easier to understand.

This is not what "unreachable" means. If assert(0) was
unreachable, then the compiler would be free to assume that no
CFG edges *into* the BB holding the instruction are ever taken
(and as a corollary, it could also decide not to emit any code for
it). Thus, the term certainly shouldn't appear anywhere near
assert(0) in the spec, except to point out the difference.

Cheers,
David
David Nadlinger via Digitalmars-d
2014-07-31 22:26:15 UTC
Post by David Nadlinger via Digitalmars-d
servers
Gah, "serves". Also, I hope the post didn't come across as
condescending, as it certainly wasn't intended that way. I just
figured it would be a good idea to define the terms we are using,
as we seemed to be continuously talking past each other.

David
Sean Kelly via Digitalmars-d
2014-07-31 20:52:29 UTC
Post by Walter Bright via Digitalmars-d
3. Use of assert to validate input is utterly wrong and will
not be supported. Use such constructs at your own risk.
...
Post by Walter Bright via Digitalmars-d
6. enforce() is meant to check for input errors (environmental
errors are considered input).
7. using enforce() to check for program bugs is utterly wrong.
enforce() is a library creation, the core language does not
recognize it.
Could you expand on what you consider input? For example, if a
function has an "in" contract that validates input parameters, is
the determination that a parameter is invalid a program bug or
simply invalid input? If you consider this invalid input that
should be checked by enforce(), can you explain why?
ponce via Digitalmars-d
2014-07-31 21:01:41 UTC
Post by Sean Kelly via Digitalmars-d
Post by Walter Bright via Digitalmars-d
3. Use of assert to validate input is utterly wrong and will
not be supported. Use such constructs at your own risk.
...
Post by Walter Bright via Digitalmars-d
6. enforce() is meant to check for input errors (environmental
errors are considered input).
7. using enforce() to check for program bugs is utterly wrong.
enforce() is a library creation, the core language does not
recognize it.
Could you expand on what you consider input? For example, if a
function has an "in" contract that validates input parameters, is
the determination that a parameter is invalid a program bug or
simply invalid input? If you consider this invalid input that
should be checked by enforce(), can you explain why?
This also puzzles me. There is the point where the two types of
errors blend to the point of being uncomfortable.

Eg: a program generates files in X format and can also read them
with a X parser. Its X parser will only ever read output
generated by itself. Should input errors in X parser be checked
with assert or exceptions?
Timon Gehr via Digitalmars-d
2014-07-31 21:39:48 UTC
Post by Sean Kelly via Digitalmars-d
Could you expand on what you consider input? For example, if a
function has an "in" contract that validates input parameters, is
the determination that a parameter is invalid a program bug or
simply invalid input? ...
The assertions in an 'in' contract are obligations at the call site.

I.e., the code:

void baz(int x){
assert(x>2);
// ...
}

is buggy.

The code

void foo(int x)in{ assert(x>2); }body{ assert(x>2); }

is correct.

The code:

void bar(int x){ foo(x); }

is buggy.

The code

void bar(int x){
enforce(x>2);
foo(x);
}

is fine.
Post by ponce via Digitalmars-d
This also puzzles me. There is the point where the two types of errors
blend to the point of being uncomfortable.
Eg: a program generates files in X format and can also read them with a
X parser. Its X parser will only ever read output generated by itself.
Should input errors in X parser be checked with assert or exceptions?
Use 'assert' for checking things which you expect to be true. But don't
be fooled, besides having some loosely checked code documentation, the
main reason to write down assertions is that they will occasionally fail
and tell you something interesting which you didn't know yet about your
program (as it was nicely formulated somewhere else in this thread.)
This makes assertions seem a little schizophrenic at times, but after
all, checking those assertions, which you _expect_, by definition, to be
no-ops anyway, may be too expensive for you and then they can be
disabled. I'd check what are the actual performance gains before doing
this though. You can also control assertions in a more fine-grained way
by guarding them with version statements.

Furthermore, use 'assert' as well in _in_ contracts, and think about
them being checked in the context of the caller as an obligation instead
of as being checked in your own function.

Use 'enforce' for things which you expect might be false sometimes and
that you may want to handle as an exception in this case.

I wouldn't think about 'enforce' as 'not checking program bugs' too hard
though. Maybe the bug is in code which does not actually expect the
exceptional path to be taken.

The difference between

void foo1(int x)in{ assert(x>2); }body{ ... }

and

void foo2(int x){
enforce(x>2);
}

is that 'foo2' reliably throws an exception if the contents of x are not
greater than 2; this is part of it's behaviour and would be part of its
documentation, while 'foo1' just states in its documentation that it
will do a certain thing _provided_ x is greater than 2, and that its
behaviour is left unspecified otherwise.
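The two styles can be put side by side in a runnable sketch. The names `halve` and `halveChecked` are hypothetical, and the contract is written with `do`, which newer compilers accept in place of the `body` keyword used in this thread.

```d
import std.exception : enforce;

// Contract style (like foo1): x > 2 is the caller's obligation.
// Behaviour for other values is left unspecified.
int halve(int x)
in { assert(x > 2); }
do
{
    return x / 2;
}

// Checked style (like foo2): x > 2 is verified on every call, and the
// exception is part of the documented behaviour.
int halveChecked(int x)
{
    enforce(x > 2, "x must be greater than 2");
    return halve(x);
}
```

A caller holding unvalidated data would go through `halveChecked` and can catch the exception; a caller that already knows x > 2 can call `halve` directly.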


Or that's what I would say anyway if there wasn't talk about turning
assertions into assumptions in -release. If nothing changes, always use
version(assert) to guard your assert statements unless you want their
failures to be undefined behaviour in -release mode.
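As a rough sketch of guarding assertions with version statements: `version (assert)` is the predefined identifier that is set when assertions are compiled in, while `ExpensiveChecks` is a hypothetical user-defined identifier that would be enabled with `-version=ExpensiveChecks`. The lookup function itself is only there to give the checks something to protect.

```d
import std.algorithm.sorting : isSorted;

// Binary search over input that the caller promises is sorted.
bool contains(const int[] sorted, int needle)
{
    // O(n) check, compiled only with -version=ExpensiveChecks (hypothetical name).
    version (ExpensiveChecks)
    {
        assert(isSorted(sorted[]), "input must be sorted");
    }
    // Cheap sanity check, compiled only when assertions are enabled.
    version (assert)
    {
        assert(sorted.length == 0 || sorted[0] <= sorted[$ - 1]);
    }

    size_t lo = 0, hi = sorted.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (sorted[mid] == needle) return true;
        if (sorted[mid] < needle) lo = mid + 1;
        else hi = mid;
    }
    return false;
}
```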
Timon Gehr via Digitalmars-d
2014-07-31 21:40:25 UTC
Post by Timon Gehr via Digitalmars-d
it's behaviour
gah.
Walter Bright via Digitalmars-d
2014-08-01 01:27:56 UTC
Post by ponce via Digitalmars-d
This also puzzles me. There is the point where the two types of errors blend to
the point of being uncomfortable.
Eg: a program generates files in X format and can also read them with a X
parser. Its X parser will only ever read output generated by itself. Should
input errors in X parser be checked with assert or exceptions?
Exceptions. Although there are grey areas, this is not one. Filesystems are
subject to all kinds of failures, exhaustions, modification by other processes,
etc., which are not logic bugs in your program.


If you're brave and want to have some fun, fill up your hard disk so it is
nearly full. Now run your favorite programs that read and write files. Sit back
and watch the crazy results (far too many programs assume that writes succeed).
Operating systems also behave erratically in this scenario, hence the 'brave'
suggestion.
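A minimal sketch of what treating file contents as input could look like, assuming a hypothetical one-line header of the form `count=<integer>`; the format and the function name are made up for illustration.

```d
import std.conv : to;
import std.exception : enforce;

// Parses a header line of the hypothetical form "count=<integer>".
// The data comes from a file, so malformed content is reported with an
// exception, not an assert, even if this program wrote the file itself.
int parseCount(string line)
{
    enum prefix = "count=";
    enforce(line.length > prefix.length && line[0 .. prefix.length] == prefix,
            "malformed header: " ~ line);
    // to!int throws a ConvException if the remainder is not an integer.
    return line[prefix.length .. $].to!int;
}
```

A truncated or externally modified file then surfaces as a catchable exception at the parsing boundary.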

Walter Bright via Digitalmars-d
2014-07-31 21:11:11 UTC
Post by Sean Kelly via Digitalmars-d
Could you expand on what you consider input?
All state processed by the program that comes from outside the program. That
would include:

1. user input
2. the file system
3. uninitialized memory
4. interprocess shared memory
5. anything received from system APIs, device drivers, and DLLs that are not
part of the program
6. resource availability and exhaustion
Post by Sean Kelly via Digitalmars-d
For example, if a
function has an "in" contract that validates input parameters, is
the determination that a parameter is invalid a program bug or
simply invalid input?
An "in" contract failure is a program bug. Contracts are ASSERTIONS ABOUT THE
CORRECTNESS OF THE PROGRAM LOGIC. They are not assertions about the program's input.
Post by Sean Kelly via Digitalmars-d
If you consider this invalid input that
should be checked by enforce(), can you explain why?
This says it better than I can:

http://en.wikipedia.org/wiki/Design_by_contract
Sean Kelly via Digitalmars-d
2014-07-31 21:29:57 UTC
Post by Walter Bright via Digitalmars-d
Post by Sean Kelly via Digitalmars-d
Could you expand on what you consider input?
All state processed by the program that comes from outside the
program. That would include:
1. user input
2. the file system
3. uninitialized memory
4. interprocess shared memory
5. anything received from system APIs, device drivers, and DLLs
that are not part of the program
6. resource availability and exhaustion
So effectively, any factor occurring at runtime. If I create a
library, it is acceptable to validate function parameters using
assert() because the user of that library knows what the library
expects and should write their code accordingly. That's fair.
Timon Gehr via Digitalmars-d
2014-07-31 22:10:37 UTC
Post by Sean Kelly via Digitalmars-d
Post by Walter Bright via Digitalmars-d
Post by Sean Kelly via Digitalmars-d
Could you expand on what you consider input?
All state processed by the program that comes from outside the
program. That would include:
1. user input
2. the file system
3. uninitialized memory
4. interprocess shared memory
5. anything received from system APIs, device drivers, and DLLs that
are not part of the program
6. resource availability and exhaustion
So effectively, any factor occurring at runtime. If I create a
library, it is acceptable to validate function parameters using
assert() because the user of that library knows what the library
expects and should write their code accordingly. That's fair.
It is most fair inside the 'in' contract.