by sharpneli
1267 days ago
That part is a bit weird, because checking data dependencies is precisely what OoO does. If that were strictly part of what superscalar means, the OoO moniker would be redundant. If one doesn't execute things out of order, then checking for dependencies is useless. However, above that is the simple definition: as long as it executes more than a single instruction per clock, it's superscalar. Even SIMD is taken as an example of a superscalar CPU.
As for the original claim, the ia64 is very much a superscalar CPU. All VLIW designs are superscalar. VLIW can also be thought of as an OoO superscalar CPU with the reorder buffer and dependency analyzer ripped out, exposing the execution units explicitly in their gory details. Or alternatively, we can say that VLIW already won; we just added a big honking chip on top to JIT-compile machine code into VLIW micro-ops.
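To make "checking data dependencies" concrete, here is a toy sketch of the pairing check an in-order dual-issue front end might do before sending two adjacent instructions down in the same clock. The `Instr` tuple and `can_dual_issue` are made-up names for illustration, not taken from any real design, and real hardware also has to consider structural hazards, flags, memory and so on.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record: one destination, some sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def can_dual_issue(first: Instr, second: Instr) -> bool:
    """True if `second` may issue in the same cycle as `first`.

    Assumes both instructions read their sources at issue, so a
    write-after-read pair is harmless here; only RAW and WAW block pairing.
    """
    raw = first.dest is not None and first.dest in second.srcs
    waw = first.dest is not None and first.dest == second.dest
    return not (raw or waw)
```

A VLIW compiler would run roughly this kind of check at build time and encode the answer in the bundle, which is the sense in which the dependency analyzer gets "ripped out" of the hardware.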
|
Even in-order, non-superscalar CPUs need some kind of dynamic checking if they allow out-of-order completion, i.e. if they let successive low-latency instructions execute under the shadow of a preceding high-latency instruction [1]. I'm not a CPU architect, but this tracking is much simpler than the register renaming of OoO and only relies on hardware interlocks.
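A rough sketch of what such an interlock amounts to, as a toy simulation: a single-issue, in-order pipeline that only remembers when each register will be ready. The `simulate` function, the register names and the latencies are all invented for illustration; no real machine is being modelled.

```python
def simulate(program, latency):
    """In-order, single-issue pipeline with a simple register scoreboard.

    program: list of (op, dest, srcs) tuples, in program order
    latency: dict mapping op -> cycles until its result is ready
    Returns one (issue_cycle, complete_cycle) pair per instruction.
    """
    ready_at = {}  # register -> cycle at which its pending write lands
    cycle = 0
    timeline = []
    for op, dest, srcs in program:
        # Interlock: stall until all sources are ready (RAW) and any
        # earlier write to the destination has finished (WAW).
        regs = list(srcs) + [dest]
        issue = max([cycle] + [ready_at.get(r, 0) for r in regs])
        done = issue + latency[op]
        ready_at[dest] = done
        timeline.append((issue, done))
        cycle = issue + 1  # next instruction issues no earlier than this
    return timeline


program = [
    ("div", "r1", ("r2", "r3")),  # long latency
    ("add", "r4", ("r5", "r6")),  # independent: runs under the div's shadow
    ("add", "r7", ("r1", "r4")),  # needs r1: stalls until the div is done
]
print(simulate(program, {"div": 20, "add": 1}))
# prints [(0, 20), (1, 2), (20, 21)]
```

The independent add completes at cycle 2, long before the divide at cycle 20, so completion is out of order even though issue never is. There is no renaming and no reordering, just a per-register "not ready yet" check.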
I think the grandparent is right in distinguishing VLIW, especially exposed-pipeline designs, from superscalar, as they have no tracking at all and just naively issue bundles; I think it is a useful distinction.
EPIC is more complicated: while it allows expressing intra-bundle parallelism, it also allows dependencies, so it does need hardware interlocks. You could argue either way.
I think SIMD by itself should not be considered superscalar, as it is still executing a single instruction in a single execution unit (compare the term superscalar itself to vector computation). Of course, a superscalar CPU could have the capability to issue distinct SIMD instructions in parallel (for example, Larrabee).
CPU microarchitecture details are fun!
[1] There are many examples, starting with the CDC 6600; Quake was also famous for implementing texture perspective correction by scheduling a division every 16 pixels and using a fast approximation for the remaining pixels.
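For the curious, the arithmetic behind the Quake trick can be sketched like this. It is not Quake's actual code: the function names and the `uoz`/`ooz` parameters (u/z and 1/z at the two ends of a span) are my own, and it only shows the "one divide per 16 pixels, interpolate the rest" part, not the overlap of the divide with the pixel loop.

```python
def exact_span(uoz0, uoz1, ooz0, ooz1, width):
    """Per-pixel divide: u = (u/z) / (1/z) at every pixel (width >= 2)."""
    out = []
    for x in range(width):
        t = x / (width - 1)
        uoz = uoz0 + (uoz1 - uoz0) * t  # u/z is linear in screen space
        ooz = ooz0 + (ooz1 - ooz0) * t  # so is 1/z
        out.append(uoz / ooz)
    return out


def subdivided_span(uoz0, uoz1, ooz0, ooz1, width, step=16):
    """One divide per `step` pixels, linear interpolation in between."""
    def exact_u(x):
        t = x / (width - 1)
        return (uoz0 + (uoz1 - uoz0) * t) / (ooz0 + (ooz1 - ooz0) * t)

    out = []
    for start in range(0, width, step):
        end = min(start + step, width - 1)
        u_a, u_b = exact_u(start), exact_u(end)  # the only divides
        n = end - start
        for i in range(min(step, width - start)):
            out.append(u_a + (u_b - u_a) * i / n if n else u_a)
    return out
```

The two functions agree wherever a divide is actually done and drift slightly apart in between. On a real in-order CPU the point is that the next divide can be started early and left to finish in the background while the cheap interpolated pixels are being drawn.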