by GuestHNUser 865 days ago
> Must have to do something with imprintment

Eh, I think most C programmers' frustrations with UB stem from knowing that the C standard has fundamental flaws and that modern compilers abuse that fact for "optimizations" on UB. This paper covers the topic pretty well[0].

I fully support Casey Muratori's viewpoint that undefined behavior should not exist in the standard[1]. Instead, the C standard should enumerate all valid behaviors that compliant compilers can implement. This would allow compilers to make unintuitive optimizations for the platforms that need them, while still letting programmers be certain that their program semantics will not change between versions of a given compiler.

[0] https://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_20...

[1] https://youtu.be/dyI0CwK386E?si=vsqJ8uWHY8xkGmFm


> I think most C programmers' frustrations with UB stem from knowing that the C standard has fundamental flaws and modern compilers abuse that fact for "optimizations" on UB

Do they actually know that? I don't think so. Take [0], for instance (the author is an experienced C programmer who loves the language and wrote a re-implementation of bc in it):

    There seemed to be a lot of misunderstandings; I could not get a handle on what this person thought UB meant.

    I finally figured it out: this person’s definition of UB was not “the language spec can’t guarantee anything.” Instead, it was “compilers can assume UB does not exist and optimize accordingly.”

    Wat.

Yep. Apparently, that was news to him, even though C implementations have been behaving like that for about 30 years already. You can read the rest of the post for the "this is evil, we the users must do something about it" take. And the proposed "something" is not "we should instead use a language with actually defined semantics", oh no. It's "use compiler flags to force more reasonable behaviour, hopefully" and "somebody should write boringcc, unfortunately, I am myself a bit too busy for that". Well, despite numerous pleas and several attempts, nobody has managed to write boringcc which is telling of something, I'm just not sure of what exactly.

So... I think it is about imprinting: "Oh, it's a wonderful language, well, it would be if it was actually implemented the way I used to think it is implemented (and I still think it should be implemented that way), but still, it's a wonderful language if only not for that pesky reality" is forcing unfounded expectations onto reality, and where do these expectations even come from in the first place?
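For anyone who has not run into the "compilers can assume UB does not exist" reading in practice, here is a minimal sketch of the classic case. The function names are made up for illustration, and whether a particular compiler really folds the first check away depends on the compiler, its version, and the flags used.

```c
#include <limits.h>
#include <stdbool.h>

/* Overflow check written in terms of the overflow itself. Signed overflow
 * is UB, so an optimizer may assume x + 1 never wraps and is permitted to
 * reduce this function to "return false". Calling it with INT_MAX is UB. */
bool next_overflows_naive(int x)
{
    return x + 1 < x;
}

/* The same question asked without ever performing the overflowing
 * addition, so the result is defined for every int value. */
bool next_overflows_checked(int x)
{
    return x == INT_MAX;
}
```

The "use compiler flags" route would be something like -fwrapv on GCC or Clang, which defines signed overflow as two's-complement wraparound and makes the naive check mean what its author expected.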

And yeah, I fully agree with you that the C standard should've probably done that. But it didn't happen, and it can't happen because of backwards compatibility [1]. But even then, C programs would still be non-portable, in the sense that you have to use #ifdefs to deal with platform-specific behaviour for anything interesting, because the C standard even today leaves a lot of stuff completely up to the implementation; see [2] for an especially appalling example, even without touching UB.

[0] https://gavinhoward.com/2023/08/the-scourge-of-00ub/

[1] https://thephd.dev/your-c-compiler-and-standard-library-will...

[2] https://thephd.dev/conformance-should-mean-something-fputc-a...

> Well, despite numerous pleas and several attempts, nobody has managed to write boringcc which is telling of something, I'm just not sure of what exactly.

John Regehr has a blog post that touches upon why this is the case [0]:

> I’ll assume you’re familiar with the Proposal for Friendly C and perhaps also Dan Bernstein’s recent call for a Boring C compiler. Both proposals are reactions to creeping exploitation of undefined behaviors as C/C++ compilers get better optimizers. In contrast, we want old code to just keep working, with latent bugs remaining latent.

> After publishing the Friendly C Proposal, I spent some time discussing its design with people, and eventually I came to the depressing conclusion that there’s no way to get a group of C experts — even if they are knowledgable, intelligent, and otherwise reasonable — to agree on the Friendly C dialect. There are just too many variations, each with its own set of performance tradeoffs, for consensus to be possible.

Granted, at the end he says that it's not that it's impossible, it's just not for him, so it's still possible that someone else would succeed, but they'd have an uphill battle for sure.

[0]: https://blog.regehr.org/archives/1287

Those ideas are not convincing. Consider the glibc bug from yesterday. If I understood it correctly, there is an integer overflow causing a heap overflow. The integer overflow was unsigned wraparound, so fully defined behavior. So no, we do not want latent bugs to remain latent. We want UB sanitizers and checkers, and we want to fix all this crap.
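To make the "fully defined, still a bug" point concrete, here is a hedged sketch of how unsigned wraparound in a size computation can lead to an undersized heap buffer. This is not the glibc code; alloc_array is a hypothetical helper written only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for count elements of elem_size bytes.
 * The product count * elem_size is computed in size_t and wraps modulo
 * SIZE_MAX + 1, which is fully defined behavior. Without the check below,
 * a wrapped product would yield a buffer that is far too small, and later
 * writes of count elements would overflow the heap. */
void *alloc_array(size_t count, size_t elem_size)
{
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return NULL; /* the product would wrap */
    }
    return malloc(count * elem_size);
}
```

Because the wraparound is defined, it is not covered by -fsanitize=undefined; Clang has a separate -fsanitize=unsigned-integer-overflow check for it, which also fires on intentional wraparound.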

I think that's arguably a distinct concern from what boringcc/friendly C are intended to address. My understanding is that the latent bugs those are intended to keep are those that would be unearthed by aggressive exploitation of UB. Sanitizers/checkers/etc. would still be useful with boringcc/friendly C, IIRC, for finding bugs arising out of fully-defined behavior.

Sanitizers / checkers are much less useful if the behavior is defined, because you can then not be sure that it is not being used intentionally in the way the definition allows. So you cannot check at run time in production and stop the program, as you can with UB, and during testing you would need to analyze each case individually.

I think sanitizer usefulness for defined behaviors is going to depend a lot on how frequently those behaviors are intentionally used and how easy/hard it would be to suppress false positives.

There might be some behaviors that sanitizers don't currently catch but are used for optimizations as well, and boringcc/friendly C might be useful for those? Not entirely sure what those might be, if they even exist, though; IIRC sanitizers don't currently flag strict aliasing violations, but -fno-strict-aliasing exists so there's no need for a new compiler/dialect (for GCC/Clang?)
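A small sketch of the strict aliasing case mentioned above, with hypothetical function names. It assumes a 32-bit float (checked at compile time); the bit pattern in any test would additionally assume IEEE 754.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

/* Reads a float object through a uint32_t lvalue. This violates the
 * effective type (strict aliasing) rules and is UB in standard C; it is
 * the kind of code -fno-strict-aliasing is meant to keep working.
 * Shown only for contrast. */
uint32_t float_bits_cast(const float *f)
{
    return *(const uint32_t *)f;
}

/* Copying the bytes is defined regardless of aliasing flags. */
uint32_t float_bits_memcpy(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

The memcpy version does not rely on any dialect flag, which is roughly why no new compiler is needed for this particular behavior.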

As the author of the first blog post, I think you got my conclusion wrong because I'm writing a language without such UB. So I would like to have a better language.

But I was also talking about UB in the context of already existing code! My argument is that compiler writers are breaking existing code when they could very well avoid breaking it.

We have tried to convince compiler writers to not do this, but they have refused. So that's why I said that users must do something: because compiler writers won't.

And yes, I knew compilers would take every advantage that they could before that; my surprise was that someone considered UB's sole purpose to be for the benefit of compiler writers at the expense of everyone else, including non-programmers who suffer catastrophic consequences for security bugs.

Actually, the definition of UB is not "compilers can assume UB does not exist" but indeed "the language spec can't guarantee anything". We clarified this in the C23 spec (although a limited form of "compilers can assume UB does not exist" can be derived from "the spec can't guarantee anything").

Well, I was under the impression that the word "anything" actually means "anything", so e.g. "the language spec can't guarantee that the compilers won't assume that the execution paths with UB on them never happen and then use that assumption for optimization" is a valid instantiation of "the language spec can't guarantee anything".

Actually, the spec does not say exactly "the language spec can't guarantee anything". It says "undefined behavior --- behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this document imposes no requirements." So it says "no requirements", but this refers specifically ("for which") to the behavior in question and not to the whole program. This implies that preceding observable behavior cannot be affected (but non-observable behavior can, according to the "as-if" rule, and this is sufficient for optimization). We added a note to clarify this in C23.
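A rough sketch of what that reading means in code, using a volatile write as the observable behavior. The names status_flag and mark_then_divide are hypothetical, and whether every compiler currently honors this is a separate question from what the text says.

```c
/* Hypothetical stand-in for something observable, e.g. a status register. */
volatile int status_flag;

/* The volatile write is observable behavior and precedes the division.
 * The division is UB if b == 0 (or for INT_MIN / -1). Under the reading
 * described above, that later UB does not license dropping or undoing
 * the earlier volatile write. */
int mark_then_divide(int a, int b)
{
    status_flag = 1;
    return a / b;
}
```

Everything non-observable around the division can still be rearranged under the as-if rule, which is where the room for optimization comes from.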

Wow, those linked blog posts are... eye-opening.