by clarry 3285 days ago
You are free to pave the way.

But pretending that compiler developers are currently just ignoring the issue out of mischief and a preference for performance above everything else is disingenuous. If the problem were that simple, we would probably already have a dozen free static analyzers that do a good job, and you would be happy to use them.

The thing is, "detecting" UB in the sense you imply is usually not what happens in the context of said optimizations (that's not to say compilers never attempt it... ever seen a compiler warn you about use of an uninitialized variable?).

What compilers do is assume the program is correct. Following that assumption, they perform an optimization that is only valid for correct programs. That is in fact very simple to do. They do not try to prove or disprove that the program actually invokes UB there -- that is impossible in general, and even in the subset of cases where it is possible, it could require deep whole-program (including libraries!) analysis that could take massive computational resources.
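To make that concrete, here is a minimal sketch in C (my own illustration; the function names are hypothetical). Signed overflow is UB, so a compiler that assumes the program is correct is allowed to simplify x + 1 > x to "always true" without ever checking whether x can be INT_MAX. Whether a given compiler actually does so depends on the compiler and its flags.

```c
#include <limits.h>
#include <stdbool.h>

/* Relies on signed overflow when x == INT_MAX, which is UB.
   A compiler may assume that never happens and fold this to "return true". */
bool next_is_larger_naive(int x)
{
    return x + 1 > x;
}

/* Well-defined for every int: compares against the limit, never overflows. */
bool next_is_larger_checked(int x)
{
    return x < INT_MAX;
}
```

Note that nothing here requires the compiler to detect UB: it only applies a rewrite that is valid for every input that does not overflow.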

Many people here keep trivializing the problem but I don't think they understand the problem at any depth.

And that is why I say you should pave the way, not in a smug "fuck off get off my lawn fix your own problem" sense, but to get people to honestly gain some background in program analysis, read research papers, study existing analyzers, and gain some appreciation for what it takes. It is far, far from trivial. Especially if you can't just take the language and change it to your liking (breaking nearly all existing code) until all the hard stuff is out.

And in saying that, I suggest that it is easier to start with a new language (or at least some existing language other than C) that was designed from the ground up with such analysis & correctness provability in mind.

2 comments

I might actually end up doing that.

But honestly, what a waste of resources. I would prefer to work on topics that are not problems created by a recent shift in views on how compilers should be designed and how they should optimize, a shift that is for the worse IMO.

And yep, I know the reasons why, with the current design, it is not always easy. That's why I added the caveat that maybe that design should be changed.

But then you cannot just say that all the users criticizing and "trivializing" the problem are in the wrong and do not know what they are talking about. Most of the discussion I've seen does one of three things. It points at technical justifications for why those crazy optims should not be done, be it for safety or to leave room for real optimizations by the programmer (by comparison, I'm not impressed by the ability of compilers to take poor code and transform it into somewhat less poor code, especially when the language in question is C, which has always required mastery). Or it points to discussions among compiler authors that are borderline insane (when you see them discussing how the standard would technically allow them not to treat uint8_t and char as aliasing, and asking themselves whether they could "optimize" more thanks to that, you clearly understand they have completely lost it, and that things will end badly). Or it points to bugs "exposed" by that class of optims (I would even say introduced, because 1) when your binary has been OK on all your target platforms for a few decades, and a new compiler comes along and breaks it on a technicality, it can be argued that regardless of said technicality the problem is mainly with the compiler; and 2) compilers compromise, e.g. for technically invalid benchmarks, so they have no credibility when they do not take into account major breakage in major software; they are just acting like spoiled children who want to do what they want regardless of the consequences).
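For reference, a sketch of what is at stake in the uint8_t/char point (my illustration; sum_bytes is a hypothetical name). The standard only guarantees that character types may alias any object. In practice uint8_t is almost always a typedef for unsigned char, but the standard does not spell that out, which is the loophole being discussed. Going through unsigned char is the form that is guaranteed:

```c
#include <stddef.h>

/* Sum the bytes of any object's representation.
   Reading through unsigned char * is explicitly permitted for every object type. */
unsigned sum_bytes(const void *obj, size_t n)
{
    const unsigned char *p = obj;
    unsigned total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

A huge amount of existing code does exactly this with uint8_t * instead and has always worked.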

What is never addressed by C/C++ compiler authors and their fan base is why the "nonportable" part of the standard is not taken into consideration anymore. This was the original spirit of UB. NOT "this will permit EXTRA optimization in abstract compiler code"; the only "optims" came from differences between target processors: leaving things UB could yield a better (trivial) mapping to each instruction set, and it was assumed that the programmer knew about those differences and could use them when needed.
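One commonly cited example of that target-driven kind of UB (my illustration; shl32 is a hypothetical helper): shifting a 32-bit value by 32 or more. As far as I know, x86 masks the shift count to its low 5 bits while 32-bit ARM yields 0 for a count of 32, so the standard leaves the result undefined and x << n can map to a single instruction on either. Code that wants one specific behaviour everywhere has to spell it out:

```c
#include <stdint.h>

/* Left shift with a defined result for every count: counts >= 32 give 0. */
uint32_t shl32(uint32_t x, unsigned n)
{
    return n < 32 ? x << n : 0;
}
```

The extra comparison is the price of portability; a programmer who knows the target could drop it under the old "trivial mapping" reading.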

Now there is no mapping anymore. We all have to target the common denominator of all existing targets: past, present, even dead ones. The result is a very, very, very poor language, unsuitable for more and more things.