I'll leave your first question to frogblast's comment here, as I really struggled to answer it well, given my limited knowledge and being elbow-deep in an analogy, after all. I got writer's block, and frogblast actually answered something :D

> how akin GPUs "stream multiprocessors"/cores are to CPUs ones at the microarchitectural level?

I'd say, if you want to get a feel for it in a way that's directly relevant to recent designs, then reading through [1], [2], the subsequent conversation between the two, and the documents they reference should scratch that curiosity itch well enough, from the looks of it.

If you want a much more rigorous treatment, I can recommend the GPU portion of one of the CMU lectures: [3]. It's quite great IMO. It spends a little less time on the contemporary design decisions that actually ship in tens of millions+ of products today, and strays into alternatives a bit. That's the trade-off.

> Are they out-of-order?

Short answer: no.

GPUs get an "out of order"-like effect by picking a different warp entirely and making progress there while the stalled warp waits. Separate warps have no register data dependencies on each other, so this sidesteps those dependencies and the need to track them across instructions in flight, and it reaches a similar end objective in a drastically more area- and power-efficient manner than Tomasulo's algorithm would.
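To make the warp-switching idea concrete, here's a toy simulation in Python. It's only a sketch under loud assumptions: one instruction issued per cycle, a fixed lowest-index-first pick among ready warps, and made-up latencies. The names `Warp`, `simulate` and `make_warps` are hypothetical and don't correspond to any real GPU's scheduler. Each warp issues strictly in order and just stalls until its registers are ready; the only "reordering" is across warps.

```python
from dataclasses import dataclass


@dataclass
class Warp:
    program: list  # list of (dest_reg, src_regs, latency_in_cycles) tuples
    pc: int = 0    # next instruction to issue; a warp never skips ahead


def simulate(warps, max_cycles=1000):
    """Toy scheduler: each cycle, issue from the first warp whose next
    instruction has all of its registers ready.

    Returns (cycles, trace). trace[c] is the index of the warp that issued
    in cycle c, or None if every unfinished warp was stalled that cycle.
    """
    # ready_at[w][reg] = first cycle in which reg's value is available
    ready_at = [dict() for _ in warps]
    trace = []
    for cycle in range(max_cycles):
        if all(w.pc >= len(w.program) for w in warps):
            break
        issued = None
        for i, w in enumerate(warps):
            if w.pc >= len(w.program):
                continue
            dest, srcs, latency = w.program[w.pc]
            # In-order check, no renaming: wait for sources, and also for
            # any pending write to the destination register.
            regs = tuple(srcs) + (dest,)
            if all(ready_at[i].get(r, 0) <= cycle for r in regs):
                ready_at[i][dest] = cycle + latency
                w.pc += 1
                issued = i
                break
        trace.append(issued)
    return len(trace), trace


def make_warps(n):
    # Assumed workload: a 4-cycle "load" into r0, then an add that needs r0.
    program = [("r0", (), 4), ("r1", ("r0",), 1)]
    return [Warp(program=list(program)) for _ in range(n)]


if __name__ == "__main__":
    print(simulate(make_warps(1)))  # one warp: stall cycles show up as None
    print(simulate(make_warps(4)))  # four warps: the stalls get filled
```

With one warp, the dependent add waits three cycles behind the load, so two instructions take five cycles to issue. With four warps running the same program, the scheduler fills those slots with the other warps' loads, and eight instructions issue in eight cycles with no idle slot. No instruction within any warp was reordered.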

> Do they do register renaming?

Short answer: no.

[1] https://forums.macrumors.com/threads/3d-rendering-on-apple-s...

[2] https://forums.macrumors.com/threads/3d-rendering-on-apple-s...

[3] https://www.youtube.com/watch?v=U8K13P6loyk ("Lecture 15. GPUs, VLIW, Execution Models - Carnegie Mellon - Computer Architecture 2015 - Onur Mutlu")