by pjmlp 875 days ago
Their design is completely different; it isn't shared execution units like on CPUs.

But it is. GPUs have many more threads in flight than execution units.
Thread groups get exclusive resources in the SIMT execution pipelines.
And on a memory stall they are swapped out for other waiting thread groups.

Just like HT.

The scheduling algorithm is different.

CPUs target low latency (they switch often). GPUs target high throughput (they switch rarely, only when needed).

High-throughput algorithms don't have a problem with lots of threads. Low-latency algorithms do (they need a lot of cache memory because of the constant switching).
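
To make the "switch only on a memory stall" point concrete, here is a rough toy model in Python. It is only a sketch: one execution unit, every thread group alternates a short burst of compute with a long memory stall, and the unit picks up another group only when the current one stalls. The function name and the cycle counts (4 compute, 20 stall) are made up for illustration and are not taken from any real GPU.

```python
def simulate(num_groups, compute_cycles=4, stall_cycles=20, iterations=10):
    """Toy model of one execution unit shared by several thread groups.

    Each group runs `compute_cycles` of work, then waits `stall_cycles`
    on memory, repeated `iterations` times. The unit switches groups only
    when the running group stalls. Returns the fraction of time the unit
    was busy. All numbers are hypothetical.
    """
    ready_at = [0] * num_groups           # cycle at which each group can run again
    remaining = [iterations] * num_groups  # compute bursts left per group
    now = 0
    busy = 0
    while any(remaining):
        pending = [g for g in range(num_groups) if remaining[g] > 0]
        # Pick the group whose memory request finishes first.
        g = min(pending, key=lambda i: ready_at[i])
        if ready_at[g] > now:
            now = ready_at[g]             # nobody is ready: the unit sits idle
        now += compute_cycles             # run the group until it stalls again
        busy += compute_cycles
        remaining[g] -= 1
        ready_at[g] = now + stall_cycles
    return busy / now


if __name__ == "__main__":
    for n in (1, 3, 6):
        print(n, "groups ->", round(simulate(n), 3))
```

With these made-up numbers a single group leaves the unit idle most of the time, and adding groups fills the stall gaps until the unit is always busy. That is the sense in which having lots of threads in flight is not a problem for throughput-oriented code.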