by lostcolony 1874 days ago
I love seeing others bring up Chesterton's fence; it's been a reference that comes to mind with quite a lot of the WTFery I've encountered in my career (usually it remains WTFery even when looking for underlying reasons, but it at least helps remind me to question my instincts).

I don't really know enough to weigh in on this, but I can say that having pursued a lot of WTFish things in my career so far, 90% of the times I've encountered bad decisions, the explanation for it was either "it was done that way because legacy reasons" (i.e., it had to be done that way then, the reason it had to be has changed, and now it would break things to do it 'correctly') or "it was easier" (i.e., at the time the badness wasn't really going to affect anyone, or not measurably, or was very intentional tech debt, and it's only 'now' that anyone is noticing/caring).

3 comments

I've seen people make bad architectural decisions that now the company is stuck with. And it comes down to just the fact that it was a bad decision, no second guessing needed.

I've also seen "bad" decisions made due to outside constraints. These decisions look like bad decisions, except that if you try to "fix" those decisions, it becomes a lot harder than it looks.

Don't get me wrong, there are plenty of times it was cluelessness. I'm just saying, I find myself going "this is stupid" far more often than it -was- stupid. It might be now, but the reasons for it then sometimes made sense.

In this case, "it was done that way because legacy reasons" is close, but the real answer is "it was done that way because we hadn’t yet invented the parts of compiler theory required to create compilers that enforce this constraint at the type level."

All this compiler sophistication represents a step backwards for binary interfaces. For example, C++ compilers emit such incredible machinery that it's essentially impossible for foreign code to interface with the compiled objects at the binary level. As a result everything eventually gets reduced to the C ABI: simple symbols and calling conventions.

That's... what we're talking about. Simple symbols with calling conventions.

The rules for this proposed ABI are exactly the same as the existing amd64-SystemV C ABI, with one difference: the stack-to-stack copies aren't generated at the call-site; instead, the generated code at the call-site passes the address (in a register, or spilled to stack) for what it would have copied. The compiler generates the stack-to-stack copy in the generated function's prologue, using the address it was passed. Nothing more, nothing less. It's just moving the required location for certain generated code across the linkage, and keeping a temporary alive a little bit longer to make that work. (And in exchange, the temporary that the local stack variable gets put in isn't created at the call-site, so the register-file "pressure" of the change is net neutral.)
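
To make the difference concrete, here's a rough C sketch that hand-lowers the two conventions at the source level. Everything named here (`struct big`, `sum_current`, `sum_proposed`) is made up for illustration, and a real compiler would do this in generated code rather than in C source; the `memcpy` is only standing in for the stack-to-stack copy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "large" aggregate: 64 bytes of longs, which (as far as I
   understand the amd64-SysV rules) is too big for registers and is passed
   in memory. */
struct big { long v[8]; };

/* Current convention, as seen from C: the copy is made at the call-site,
   so the callee simply finds a private copy waiting for it. */
long sum_current(struct big b) {
    b.v[0] += 1;                      /* mutates only the callee's copy */
    long s = 0;
    for (size_t i = 0; i < 8; i++) s += b.v[i];
    return s;
}

/* Proposed convention, hand-lowered: the caller passes only an address,
   and the callee's "prologue" makes the copy from that address. */
long sum_proposed(const struct big *arg) {
    struct big b;
    memcpy(&b, arg, sizeof b);        /* the copy, moved into the callee */
    b.v[0] += 1;                      /* still mutates only the local copy */
    long s = 0;
    for (size_t i = 0; i < 8; i++) s += b.v[i];
    return s;
}
```

Calling `sum_current(x)` and `sum_proposed(&x)` on the same `x` should give the same result and leave `x` untouched in both cases; the only thing that moved is which side of the call does the copying.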

This is no more or less complex than the current ABI. It doesn't create more exceptions or edge-cases than the current ABI. It doesn't make the ABI harder to implement. The only thing it does is make a different, basically arbitrary choice about where to put some generated glue code (the stack-to-stack copy).

The only practical upshot of this change is that it enables compilers to sometimes do an optimization that they can't currently do, because doing said optimization would go against the rules of the amd64-SysV ABI (i.e. a caller that passed an address in a register instead of copying the value wouldn't be an amd64-SysV caller any more, and wouldn't be compatible with precompiled amd64-SysV callees any more; and vice versa for the callee).

But if-and-when a compiler does do that optimization, it's internal to the generated function. It doesn't mean that there are two potential callee "signatures" under the proposed ABI. There's only one.

Here's what the proposed ABI would probably say about stack copies:

> "The caller always passes large values by reference; the callee always receives them by reference. If the callee is taking a parameter pass-by-value, then it's up to the compiler of the callee to insert code into the callee's function prologue to turn the passed reference into a stack-local copy of the referenced data."

With that particular legalese, the callee's generated copy is still "required" by the spec, but its effects are now also "hidden" from the caller — i.e. its observable results are no longer leaking across the linkage. Therefore, the compiler is now empowered to optimize out the callee copy, as long as it can ensure the resulting code has observably equivalent results from the caller's perspective.

Note that this isn't anything the person implementing the ABI-targeting code in the compiler has to worry about. They just write the code to generate a callee function prologue that does a stack-to-stack copy. It's the person writing the optimization pass that comes after that codegen step who can now take that stack-to-stack copy and, with a static proof of read-only access by the callee in hand, drop it.
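
As a sketch of what that optimization pass buys you, here are the "before" and "after" versions of one callee, again hand-written in C under the assumption that the callee only ever reads its parameter. The names (`struct payload`, `ends_copying`, `ends_elided`, `ends_fn`) are hypothetical, not anything from a real compiler.

```c
#include <string.h>

/* Hypothetical large aggregate, passed by address under the proposed ABI. */
struct payload { long v[8]; };

/* What the plain ABI-targeting codegen would emit: copy in the prologue,
   then run the body against the local copy. */
long ends_copying(const struct payload *arg) {
    struct payload local;
    memcpy(&local, arg, sizeof local);
    return local.v[0] + local.v[7];
}

/* What a later pass could reduce it to, once it has shown the body never
   writes to the parameter: read straight through the passed address. */
long ends_elided(const struct payload *arg) {
    return arg->v[0] + arg->v[7];
}

/* Both versions have the same type, so the call-site is identical. */
typedef long (*ends_fn)(const struct payload *);
```

A caller holding an `ends_fn` gets the same answer from either version and can't tell which one it was linked against. I'd assume a real pass also has to rule out the original being modified through some other pointer while the callee is still reading it, which is part of what the "static proof" would need to cover.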

The optimization opportunity enabled by the change isn't part of the ABI's spec. The proposed ABI is just about moving the stack-to-stack copy into the callee. What the compiler chooses to do when targeting an ABI where the callee does stack-to-stack copies is up to the compiler. Presumably, it will do "whatever fiendish things it can" at -O3, and "nothing much different" at -O0. Like usual.

And either way, the linkage itself looks the same. The optimization doesn't change the linkage. Any and all tooling that examines the linkage — debuggers, disassemblers, tracers, etc. — would see the same thing, whether the optimization has occurred or not. Because the optimization isn't part of the linkage; it's internal to the codegen of the callee, enabled by the (uniformly!) modified structure of the linkage.

Yup, there's also time dependence. Perhaps someone wrote some software in COBOL that is hard to maintain now. But rewriting it may not be worth the opportunity cost now, especially for well-tested systems that have been around for a long time and which have critical failure modes. Sometimes it's better to leave things alone and work around them, even if it results in an uglier design.