by The_Colonel 686 days ago
Out-of-order execution doesn't imply parallelism; it was designed from the beginning to work around the input data availability problem.

With speculative execution and branch prediction, prefetch may seem like just a nice bonus, but given that CPU performance nowadays is largely bottlenecked by memory access, the prefetching that results from these techniques will often be the dominant performance factor.

2 comments

It's still a form of parallelism, one that could in principle be written into the program instead of being automatically implemented by the processor.

For example, in hand-crafted assembly programs, it's fairly common to know how long a fetch operation takes and to manually schedule other operations so that they execute in parallel with the fetch.
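
A toy sketch of that kind of manual scheduling, not a model of any real CPU. It assumes a single-issue, in-order machine where a load takes `LOAD_LATENCY` cycles and every other instruction takes one; the latency figure, the register names, and the `run` helper are all made up for illustration.

```python
# Toy in-order, single-issue machine: one instruction issues per cycle,
# but an instruction stalls until all of its source registers are ready.
# LOAD_LATENCY is an assumed figure, not a measurement of real hardware.
LOAD_LATENCY = 4

def run(program):
    """Return the cycle at which the last result becomes available."""
    ready = {}   # register -> cycle at which its value is available
    cycle = 0    # next free issue slot
    for op, dst, srcs in program:
        # Stall until every source operand is available.
        start = max([cycle] + [ready.get(s, 0) for s in srcs])
        latency = LOAD_LATENCY if op == "load" else 1
        ready[dst] = start + latency
        cycle = start + 1
    return max(ready.values())

# Naive order: the dependent add sits right behind the load and stalls.
naive = [
    ("load", "r1", []),
    ("add", "r2", ["r1"]),   # needs the loaded value
    ("add", "r3", ["r4"]),   # independent work
    ("add", "r5", ["r6"]),
    ("add", "r7", ["r8"]),
]

# Hand-scheduled order: independent work fills the load's latency.
scheduled = [
    ("load", "r1", []),
    ("add", "r3", ["r4"]),
    ("add", "r5", ["r6"]),
    ("add", "r7", ["r8"]),
    ("add", "r2", ["r1"]),
]

print(run(naive))      # 8
print(run(scheduled))  # 5
```

Same instructions, same results, but the reordered version hides the load latency behind independent work. An out-of-order core does this reordering dynamically, without the programmer knowing the latency in advance.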

Theoretically, a high-level language could also be designed to expose this kind of logic to the programmer. A program in such a language would be expressed as a set of very short threads that the compiler can interleave, given precise knowledge of instruction timings.

OoO came about after multi-issue architectures were starved for instructions to execute due to on-chip blockers like execution unit availability, data hazards, pipeline bubbles, register availability, branching, and cache ports. You can call those input data availability problems, but it's not availability from off-chip memory. So in actual history, yes, it was for parallelism (keeping multiple execution units busy).

OoO did have the side benefit of possibly executing past a few memory stalls, but that was secondary. OoO reordering resources were sized to address stalls on on-chip timescales. Today the resources are bigger, but the relative memory latency has grown even more (that is, how many instructions you could execute in the time it takes to service a main memory fetch that your pipeline depends on).
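
To put rough numbers on that last point, here is a back-of-the-envelope calculation. The latency, clock rate, IPC, and window size below are illustrative assumptions, not figures for any particular CPU, and `instructions_per_miss` is just a hypothetical helper name.

```python
def instructions_per_miss(latency_ns, clock_ghz, ipc):
    """Instructions that could have issued while one main memory fetch is serviced."""
    cycles = latency_ns * clock_ghz   # ns * cycles per ns
    return cycles * ipc

# Assumed values: 100 ns to main memory, 3 GHz clock, 4 instructions per cycle.
lost = instructions_per_miss(latency_ns=100, clock_ghz=3.0, ipc=4)
print(lost)  # 1200.0

# Hypothetical reorder window of 200 in-flight instructions.
window = 200
print(lost / window)  # 6.0
```

Under these assumptions, a single miss to main memory costs several times more instruction slots than the reorder window can hold, so the window fills up and the core stalls long before the data arrives.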