by juggertao 875 days ago
Yet GPUs, which take HT to the next level by having thousands of "hyper-threads", work very well for scientific computing.

Their design is completely different; it isn't shared execution units, as happens on CPUs.
But it is. GPUs have many more threads in flight than execution units.
Thread groups get exclusive resources in the SIMT execution pipelines.
And on a memory stall they are swapped out for other waiting thread groups.

Just like HT.
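
To make the swap-on-stall idea concrete, here is a toy model in Python. It is only a sketch, not how any real GPU or CPU scheduler is implemented: it assumes a single execution unit, and the function name `utilization` and the cycle counts (4 cycles of compute, then a 20-cycle memory stall) are made up for illustration. A group runs until it stalls, then another ready group takes the unit.

```python
def utilization(n_groups, compute_cycles=4, stall_cycles=20, total_cycles=10_000):
    """Fraction of cycles a single execution unit is busy (toy model).

    Each thread group computes for `compute_cycles`, then waits
    `stall_cycles` on memory. The unit switches groups only on a stall.
    All numbers are illustrative assumptions, not hardware figures.
    """
    # ready_at[g] is the first cycle at which group g can run again.
    ready_at = [0] * n_groups
    busy = 0
    cycle = 0
    while cycle < total_cycles:
        ready = [g for g in range(n_groups) if ready_at[g] <= cycle]
        if not ready:
            # Every group is waiting on memory: the unit idles
            # until the earliest outstanding request returns.
            cycle = min(ready_at)
            continue
        # Run the group that has been ready the longest.
        g = min(ready, key=lambda i: ready_at[i])
        run = min(compute_cycles, total_cycles - cycle)
        busy += run
        cycle += run
        # The group now stalls on memory and is swapped out.
        ready_at[g] = cycle + stall_cycles
    return busy / total_cycles


if __name__ == "__main__":
    for n in (1, 3, 6, 12):
        print(n, "groups ->", round(utilization(n), 3))
```

With these assumed numbers, a single group keeps the unit busy only about 4 cycles out of every 24, while six or more groups in flight cover the stall completely. That is the sense in which having more threads in flight than execution units hides memory latency.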

The scheduling algorithm is different.

CPUs target low latency (they switch often). GPUs target high throughput (they switch rarely, only when needed).

High-throughput algorithms don't have a problem with a lot of threads. Low-latency algorithms do have a problem with a lot of threads (they need a lot of cache memory because of the constant switching).