by docfort
1082 days ago
The article mixes a few things together. It essentially describes the von Neumann bottleneck, a consequence of the literal way many traditional computer architectures interpreted Turing’s vision. The thrust of the article is to warn performance engineers not to confuse the OS scheduler's view (whether a CPU is available for more work) with the micro-architectural view (whether you can expect the CPU to retire more instructions in a given number of clock cycles). In my experience on large ARM cores, the max IPC can be high, but programs that do useful work rarely achieve it. Scientific code intended for HPC makes good use of vector units, or just superscalar processing, along with (manual) interleaving of compute and memory I/O. Other code, like most web browsers, can hit IPC=1 only after a ton of tuning. Both categories are important, but usually the pot of money is larger for the HPC code, or at least the optimization path is clearer. In other words, the article is primarily intended to help someone recognize when they might want a performance engineer, rather than calling it a day when they see full CPU utilization in the scheduler.
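To make the two views concrete, here is a minimal sketch in Python. The function names and the counter values are hypothetical, chosen only for illustration; in practice the instruction and cycle counts would come from hardware performance counters, and the busy time from the OS.

```python
def scheduler_utilization(busy_seconds: float, wall_seconds: float) -> float:
    """OS scheduler view: fraction of wall time the CPU was not idle."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return busy_seconds / wall_seconds


def ipc(instructions_retired: int, cycles: int) -> float:
    """Micro-architectural view: instructions retired per clock cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions_retired / cycles


# Hypothetical counter readings for two workloads on the same core.
# Both keep the CPU "busy" for the whole interval, so the scheduler
# reports the same utilization for each.
memory_bound = {"busy": 10.0, "wall": 10.0,
                "instructions": 5_000_000_000, "cycles": 20_000_000_000}
compute_bound = {"busy": 10.0, "wall": 10.0,
                 "instructions": 40_000_000_000, "cycles": 20_000_000_000}

for name, w in (("memory_bound", memory_bound), ("compute_bound", compute_bound)):
    util = scheduler_utilization(w["busy"], w["wall"])
    print(f"{name}: utilization={util:.0%}, IPC={ipc(w['instructions'], w['cycles']):.2f}")
```

With these made-up numbers both workloads show full utilization, yet one retires eight times as many instructions per cycle as the other, which is the gap the scheduler view cannot show.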