The "how likely is it, really?" response to questions of technical correctness has always bothered me. It takes a mindset completely alien to mine to say "Here's a race condition. Sure, it's undefined behavior, but the race is narrow, so it's rare" or to say "Sure, memory allocation can theoretically fail, but in practice almost never does" or to say "fsync is too slow and most computers have batteries these days".
Software is unreliable enough as it is due to problems beneath our notice. It seems reckless to avoid fixing problems that we do notice. Sure, you could argue that rare problems are rare and that users probably won't notice them, but this attitude is penny-wise and pound-foolish, because you can't meaningfully reason about a system that's only probably correct.
Engineering is about tradeoffs. How many once-in-a-thousand bugs do you fix before you tackle the one-in-a-million? Or one in a billion? What about if it takes $10/bug to fix every 1:1000 bug and $100,000 to fix one 1:1000000 bug?
Correctness is great in theory, but in practice it's a matter of what's important.
This is really emphasized in things like DFMEA and other failure mode analysis documents in regulated industries. They want you to document the likelihood of a failure, your ability to recover from it, and the cost of the failure. You can say that you didn't want to pay for someone to fix some unlikely failure mode, but that's small consolation to the people whose lives your product is ruining.
The problem you're latching on to, I think, is how the context for calculating a probability can vary.
If X were really as unlikely as, say, the sun exploding, then it would be of no use to expend time on X.
BUT very often people speak about the probability of events given suspicious constraints. While a memory allocation might not fail in most situations, it will fail often in some situations. And a one-in-a-million event is almost guaranteed to happen when there are millions of uses.
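To put a rough number on that last point, here is a small sketch in C. It assumes every use is an independent trial with the same failure probability p, which real workloads won't satisfy exactly; the function name is made up for illustration.

```c
#include <assert.h>

/* Hypothetical helper: probability that an event with per-use probability p
 * happens at least once in n independent uses, i.e. 1 - (1 - p)^n.
 * Independence and a constant p are simplifying assumptions. */
double prob_at_least_once(double p, unsigned long n)
{
    assert(p >= 0.0 && p <= 1.0);

    double none = 1.0;                 /* probability it has not happened yet */
    for (unsigned long i = 0; i < n; i++)
        none *= 1.0 - p;
    return 1.0 - none;
}
```

Under those assumptions, p = 1e-6 over one million uses comes out at roughly 63%, and over five million uses it is above 99%.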
Also worth considering that our processors are handling billions of ops per second. One in a million might be happening all the time even for one user.
One in a million isn't just a typical statement of probability; it's a colloquialism used to refer to things that never happen in practice. It's highly misleading to use in the context of computers which, by their nature, have one in a million events occurring constantly.
>The "how likely is it, really?" response to questions of technical correctness has always bothered me.
But the question is important in another context: language design. Why is this undefined behavior something that exists in the first place? Objects larger than PTRDIFF_MAX could just not be allowed! This avoids the problem and makes code easier to reason about, with pretty much no downside.
I like the way you're thinking, but that sort of thing probably doesn't get past a committee. "Hey we might not be able to think of an application but that doesn't mean our users won't have a legitimate reason for doing it ... Motion passed."
A few months ago I was doing FFTs on arrays larger than 4GB. Amusingly, this uncovered a bug in the LLVM optimizer: It was looking at stride lengths to figure out if accesses were independent, and truncated a 4GB stride down to 0.
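For anyone wondering how a 4GB stride can turn into 0, here is a minimal sketch in C. This is not the LLVM code; the helper name is hypothetical, and I'm assuming an unsigned 32-bit type so that the narrowing is well defined.

```c
#include <stdint.h>

/* Hypothetical helper: narrows a 64-bit byte stride to 32 bits the way a
 * careless analysis might. Conversion to an unsigned type keeps only the
 * low 32 bits, so a stride of exactly 2^32 bytes becomes 0. */
uint32_t truncate_stride(uint64_t stride_bytes)
{
    return (uint32_t)stride_bytes;
}
```

Passing UINT64_C(1) << 32 (4 GiB) returns 0, which a later "are these accesses independent?" check would read as "no stride at all".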
I would be extremely interested to hear how you found this bug. Sounds like a difficult bug to track down, and I always learn from good debugging stories.
It was pretty easy to track down: clang38 was exiting with
Assertion failed: (Distance > 0 && "The distance must be non-zero"),
function areStridedAccessesIndependent, file /wrkdirs/usr/ports/devel/
llvm38/work/llvm-3.8.0.src/lib/Analysis/LoopAccessAnalysis.cpp, line 1004.
Looking at the file it was easy to see what was being asserted, and to see that the type was a 32-bit integer; since I knew I was dealing with huge FFTs, the problem was obvious.
Let this be a lesson: Asserting that impossible things don't happen makes debugging much easier when they do happen!
Not likely, but possible. This reminds me of the bug that was found in the binary search algorithm a few years ago, IIRC, in Java. The interesting thing is that binary search is probably one of the earliest-invented algorithms. Yet, in the book Writing Efficient Programs by Jon Bentley (which I mentioned in a recent HN comment), he says that in a class he taught to several industrial programmers with many years of experience, some had bugs in their implementations of binary search that he set them as an exercise. Not sure but I think I remember reading in the article about the Java binary search issue, that even his algorithm had the bug that was found in the Java version. Why it was not found earlier is (maybe) because it only occurred with an extremely large array, IIRC. Don't have a link right now, but it can probably be found by searching for the right phrase.
Just did a google search, and it even partially auto-completed this search for me:
bug in java binary search
and showed a related search in the drop-down, 'programming pearls ...', a book by Jon Bentley, which seems to confirm what I said above (though I saw it in his other book, "Efficient Programs", IIRC - he might have mentioned the same issue in the Programming Pearls book too).
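As far as I recall, the reported problem was the midpoint being computed as (low + high) / 2 with signed int indices, which overflows once the array is large enough. A sketch of the usual fix in C, with a function name and a "return n if absent" convention that are just my choices for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Searches the sorted array a[0..n) for key.
 * Returns the index of a matching element, or n if key is absent. */
size_t binary_search(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);

    size_t low = 0, high = n;          /* half-open range [low, high) */
    while (low < high) {
        /* low + (high - low) / 2 never exceeds high, so it cannot overflow,
         * unlike (low + high) / 2. */
        size_t mid = low + (high - low) / 2;
        if (a[mid] < key)
            low = mid + 1;
        else if (a[mid] > key)
            high = mid;
        else
            return mid;
    }
    return n;
}
```

The only line that matters for the bug is the midpoint; the rest is an ordinary binary search.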
It's basically bogus to have a single object bigger than or equal to half of the address space (as represented by size_t) in C. 32-bit platforms should detect and abort in such conditions (the compiler/linker for static objects, the malloc() implementation for dynamic allocations).
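For the dynamic allocation case, a wrapper along these lines would do it. This is only a sketch: checked_malloc is a hypothetical name, not a standard function, and it returns NULL rather than aborting so the caller decides what to do.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper around malloc() that refuses any request larger than
 * PTRDIFF_MAX bytes, so the difference between two pointers into the
 * returned object always fits in ptrdiff_t. */
void *checked_malloc(size_t size)
{
    /* Compare in uintmax_t to avoid signed/unsigned surprises. */
    if ((uintmax_t)size > (uintmax_t)PTRDIFF_MAX)
        return NULL;
    return malloc(size);
}
```

With this in place, a request for (size_t)PTRDIFF_MAX + 1 bytes fails up front instead of maybe succeeding and making pointer subtraction undefined later.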
It's either addressable or it isn't. My understanding of typical PAE systems is that userspace is still limited to 32 bits of address space per process. Any system where userspace is not limited to 32 bits should have a larger than 32-bit size_t. (PAE systems are not true 32-bit platforms.)
Probably not very likely, but keep in mind that this method could also be used without actually allocating the array -- akin to the 'offsetof()' macro. (Which is undefined behavior.)