by lelanthran 1641 days ago
> I am lost here, the mentioned bugs are a result of optimizations like speculative execution, branch prediction, prefetching etc.

> These are language independent optimizations. For example, any language (that allow for loop like constructs) compiled to intel machine code and executed on intel processor will be exposed to these bugs, it is not C specific. Am I missing anything?

1. There's machine code that is exposed to the ISA (public machine code that compilers generate) and there's machine code that exists and is used but is not exposed.

2. The author is making the argument that the machine code that is exposed is designed around the memory model of the C programming language, which itself was designed around the memory model of the PDP.

Put the above two together and (if you squint really hard, and ignore things like logic and reason) the conclusion is that the modern x86/x64 ISA is suboptimal because of the PDP.

The actual reality is that all the popular programming languages are imperative, have the concepts of stack, heap and in-order execution of instructions.

Because all languages appear to converge on the same basic concepts in order to be commonly accepted, I think that it is doubtful that any alternative machine and memory model would have arisen in the absence of a language like C or a machine like the PDP.

I think this because of the existence of other languages that offer alternative machine and/or memory models, and those languages have existed for decades without being popular.

> The actual reality is that all the popular programming languages are imperative, have the concepts of stack, heap and in-order execution of instructions. Because all languages appear to converge on the same basic concepts in order to be commonly accepted, I think that it is doubtful that any alternative machine and memory model would have arisen in the absence of a language like C or a machine like the PDP.

But as we can see, this model could not keep up with performance improvements, so much more complexity got implemented beneath the surface of the old model. The author’s point is to be aware of the mismatch here, and that perhaps we should stop believing the “lies” that C tells us.

I personally believe that we would be much better off with lower level instructions exposed to us, and putting the complexity in software. That way CPU vulnerabilities could be patched, and I believe we could create much better optimizations, and faster CPU design iterations.

At least some of this complexity and abstraction layer provided by the hardware is done so that the same binary can run on multiple different implementations, though. If you expose low level instructions corresponding to the specific CPU implementation then you lose "this app runs on all these Android phones" and also "this process can migrate between CPUs in a big/little setup", which would be unfortunate.

Well, I meant it more in terms of x86-to-microcode JIT compilers, but in software. Existing code could then potentially run exactly the way it does now, but instead of cumbersome hardware pipelines, the work would be done entirely in software, where the complexity ceiling is perhaps somewhat higher. This JIT compiler could do the same “magic” that current CPUs do (reordering, branch prediction, etc.) and even more, while bugs could be fixed without buying a new processor.

Ah, so a Transmeta-style approach? That's certainly feasible, in the sense that their technology worked, but I think it would be tricky at best to match the performance of the more standard do-it-in-hardware approach.

> I personally believe that we would be much better off with lower level instructions exposed to us, and putting the complexity in software

There have been several initiatives that sound vaguely like that, none of which actually worked out commercially. Principally Itanium. At the same time, there is no question that it's possible to gain a lot in performance if you're both willing and able to use a programming environment like CUDA.

It seems to me that the article never actually articulates an alternative in enough detail to take seriously. It doesn't seem to make any falsifiable claim.

In my view the most plausible explanation for why things in this area look the way they look was articulated in DJB's "The Death of Optimizing Compilers" talk. I'm not surprised that this ACM piece was written by somebody who works on optimizing compilers. Perhaps that shouldn't be relevant, but I can't help but suspect that it is.