by vgatherps
960 days ago
It would probably be even worse today. Dynamically discovering ILP “just works” even as memory gets slower and slower relative to the core. A CPU today can execute hundreds of instructions, and many predicted branches, ahead of a slow load. It would be impossible to schedule this statically (you don’t know what will or won’t be in cache), and difficult to hoist every load 100 instructions ahead of its use, especially once you take branching behavior into account. GPUs have taken over much of the niche where these processors excel: number crunching with entirely predetermined memory access and compute patterns.
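To put rough numbers on the hoisting argument, here is a toy cycle-count model, not a real pipeline simulator. It assumes an in-order core that issues one instruction per cycle and stalls a use until its load returns. The latencies (`HIT_CYCLES`, `MISS_CYCLES`), the function name `in_order_cycles`, and the "one miss every eighth load" pattern are all made up for illustration. It ignores branches entirely, which only makes static hoisting look easier than it is.

```python
HIT_CYCLES = 4     # assumed cache-hit latency (illustrative, not measured)
MISS_CYCLES = 200  # assumed cache-miss latency (illustrative, not measured)


def in_order_cycles(latencies, hoist):
    """Total cycles for a toy in-order core running load/use pairs.

    Load i is issued `hoist` iterations ahead of the use that consumes it,
    as a statically scheduled (software-pipelined) loop would do. The core
    issues one instruction per cycle and a use stalls until its data arrives.
    """
    n = len(latencies)
    ready = [0] * n  # cycle at which each load's data becomes available
    cycle = 0
    next_load = 0
    for use in range(n):
        # Issue every load that the static schedule places before this use.
        while next_load < n and next_load <= use + hoist:
            ready[next_load] = cycle + latencies[next_load]
            cycle += 1
            next_load += 1
        # In-order stall: the use cannot issue until its operand is ready.
        cycle = max(cycle, ready[use])
        cycle += 1
    return cycle


# Hypothetical access pattern: every eighth load misses the cache.
pattern = [MISS_CYCLES if i % 8 == 0 else HIT_CYCLES for i in range(64)]

print(in_order_cycles(pattern, 0))   # no hoisting: 1888
print(in_order_cycles(pattern, 2))   # hoist tuned for cache hits
print(in_order_cycles(pattern, 64))  # every load hoisted past the miss latency: 264
```

In this model the schedule only approaches the short run time once loads are hoisted on the order of the miss latency, which is the "100 instructions in advance" problem. An out-of-order core gets a similar overlap at run time without knowing in advance which loads will miss.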