HN Mirror



	by zdfjkhiuj 3073 days ago
	>The reason was data dependencies that I couldn't see, even in the assembly.

	I don't understand why that would matter. Aren't GPUs in-order? I don't know the low-level architecture of GPUs at all.

4 comments

corysama 3073 days ago

An easy explanation is that you can think of GPUs as being massively hyperthreaded. When one thread hits a data stall, another thread picks up the ALU resources until it stalls too, and so on through many, many threads before the scheduler cycles back to the original. But data stalls are very long. If there isn't enough ALU work for the other threads to do before they stall too, you'll end up back on the first thread, still waiting for data.

If you want to understand low-level GPU architecture, https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-... is a great intro.
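
A minimal sketch of the "massively hyperthreaded" picture above, as a toy model rather than a description of any real GPU. It assumes a single ALU, round-robin thread switching, and made-up numbers: each thread does `alu_cycles` of work and then waits `stall_cycles` for data. The function name and parameters are hypothetical.

```python
def alu_utilization(num_threads, alu_cycles, stall_cycles, total_cycles=10_000):
    """Fraction of cycles in which the (single, assumed) ALU does useful work."""
    ready_at = [0] * num_threads             # cycle at which each thread's data arrives
    work_left = [alu_cycles] * num_threads   # ALU cycles left before the next stall
    busy = 0
    current = 0
    for cycle in range(total_cycles):
        # Look for a thread whose data has arrived, starting with the current one.
        for offset in range(num_threads):
            t = (current + offset) % num_threads
            if ready_at[t] <= cycle:
                break
        else:
            continue  # every thread is waiting on data: the ALU idles this cycle
        current = t
        busy += 1
        work_left[t] -= 1
        if work_left[t] == 0:
            # This thread now hits a data stall; hand the ALU to the next thread.
            work_left[t] = alu_cycles
            ready_at[t] = cycle + 1 + stall_cycles
            current = (t + 1) % num_threads
    return busy / total_cycles


# 10 ALU cycles per 90-cycle stall (illustrative figures only):
one = alu_utilization(1, 10, 90)    # 0.1: one thread mostly waits
five = alu_utilization(5, 10, 90)   # 0.5: not enough ALU work to cover the stall
ten = alu_utilization(10, 10, 90)   # 1.0: enough threads to hide the stall fully
```

In this model the stall is hidden only once the combined ALU work of the other threads is at least as long as the stall itself, which is the "not enough ALU" case described above.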


AstralStorm 3073 days ago

They are parallel; a data dependency cannot be pipelined as easily.

So no, they are not in-order.

Radeons do not have speculative execution, but that does not make them in-order.


fyi1183 3072 days ago

Radeon compute units are in-order, though, in the usual sense of the word, and I'd love to hear about it if there really were an out-of-order GPU. It'd be rather surprising.

One thing they do have for dealing with data dependencies is that load (and texture fetch, etc.) instructions don't block. Instead, there's a separate instruction for waiting on the result of a previous load.
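
A rough sketch of why a non-blocking load plus a separate wait instruction helps, and why a hidden data dependency hurts. This is a toy in-order interpreter, not real ISA semantics: the opcode names `load`, `wait`, and `alu`, the 100-cycle latency, and the one-cycle issue cost are all assumptions made for illustration.

```python
LOAD_LATENCY = 100  # assumed memory latency in cycles (made-up figure)


def run(program, load_latency=LOAD_LATENCY):
    """Cycles for one in-order thread to get through `program`."""
    cycle = 0
    arrives_at = {}  # register name -> cycle at which its loaded value arrives
    for op, *args in program:
        if op == "load":
            arrives_at[args[0]] = cycle + load_latency
            cycle += 1  # issuing the load costs one cycle and does not block
        elif op == "wait":
            cycle = max(cycle, arrives_at[args[0]])  # block until the value is there
        elif op == "alu":
            cycle += 1
        else:
            raise ValueError(f"unknown op: {op}")
    return cycle


independent_work = [("alu",)] * 50

# Wait immediately after the load, then do unrelated work.
eager_wait = [("load", "a"), ("wait", "a"), ("alu",)] + independent_work
# Do the unrelated work first and wait only when the value is needed.
late_wait = [("load", "a")] + independent_work + [("wait", "a"), ("alu",)]

# Two loads whose addresses do not depend on each other can overlap.
overlapped = [("load", "a"), ("load", "b"), ("wait", "a"), ("wait", "b"), ("alu",)]
# If the address of b depends on a, the second load cannot issue until a arrives.
dependent = [("load", "a"), ("wait", "a"), ("load", "b"), ("wait", "b"), ("alu",)]

eager_cycles = run(eager_wait)       # 151
late_cycles = run(late_wait)         # 101
overlapped_cycles = run(overlapped)  # 102
dependent_cycles = run(dependent)    # 201
```

The `dependent` case is the interesting one for the original question: the machine is still strictly in-order, but a dependency between the two loads forces the waits to be serialized, so the latency is paid twice.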


andars 3073 days ago

My understanding:

GPUs are (generally) in-order within each thread, but they are pipelined. The pipeline is filled with instructions that are ready to execute from across many threads. If all threads have an unmet dependency (previous instruction or memory access), the pipeline will stall.
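
Here is a small sketch of that model, under stated assumptions: one issue slot per cycle, a fixed result latency of 4 cycles, round-robin selection among threads, and instructions written as hypothetical `(dest, srcs)` pairs. It is meant to show the idea, not how any particular GPU scheduler works.

```python
def simulate(threads, latency=4):
    """Issue at most one instruction per cycle, in order within each thread.

    threads: list of programs; each program is a list of (dest, srcs) tuples.
    Returns (cycles_until_last_issue, stall_cycles).
    """
    pcs = [0] * len(threads)               # next instruction index per thread
    ready_at = [{} for _ in threads]       # per-thread: register -> cycle its value is ready
    cycle = 0
    stalls = 0
    start = 0                              # round-robin starting point
    while any(pc < len(prog) for pc, prog in zip(pcs, threads)):
        issued = False
        for offset in range(len(threads)):
            t = (start + offset) % len(threads)
            if pcs[t] == len(threads[t]):
                continue                   # this thread has finished
            dest, srcs = threads[t][pcs[t]]
            # In-order: only the thread's next instruction may issue,
            # and only once all of its inputs are available.
            if all(ready_at[t].get(s, 0) <= cycle for s in srcs):
                ready_at[t][dest] = cycle + latency
                pcs[t] += 1
                start = (t + 1) % len(threads)
                issued = True
                break
        if not issued:
            stalls += 1                    # every thread has an unmet dependency
        cycle += 1
    return cycle, stalls


# Each instruction depends on the one before it.
chain = [("r0", ()), ("r1", ("r0",)), ("r2", ("r1",)), ("r3", ("r2",))]
# No instruction depends on another.
independent = [("r0", ()), ("r1", ()), ("r2", ()), ("r3", ())]

one_chain = simulate([chain])            # (13, 9): stalls between every instruction
one_independent = simulate([independent])  # (4, 0): the pipeline stays full
four_chains = simulate([chain] * 4)      # (16, 0): other threads fill the gaps
```

With a single thread, the dependent chain stalls three cycles out of four even though nothing executes out of order. With four threads running the same chain, the scheduler always finds a ready instruction and the stalls disappear.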


dahart 3073 days ago

GPU compilers prefer to inline everything, and they try to reuse partial results where they can, so it’s easy to end up with out-of-order dependencies in places you might not expect.
