by dragontamer 1029 days ago
Typical GPUs easily have 6000+ shaders (aka kinda-sorta like cores) on the more expensive end.

That is at least 6000+ 32-bit multiplies per clock tick at ~2GHz+ clocks. Even cheap GPUs easily have 2000+ shaders.
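
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes every shader retires exactly one 32-bit multiply per clock tick and that the clock is a flat 2GHz; the function name is made up for illustration, and real chips will land above or below these peak figures.

```python
# Peak-rate estimate only: assumes one 32-bit multiply per shader per clock tick.
def peak_multiplies_per_second(shaders: int, clock_hz: float) -> float:
    """Hypothetical helper: shaders x clock rate = multiplies per second."""
    return shaders * clock_hz

CLOCK_HZ = 2e9  # ~2GHz, the clock figure quoted above

high_end = peak_multiplies_per_second(6000, CLOCK_HZ)  # expensive GPU
budget = peak_multiplies_per_second(2000, CLOCK_HZ)    # cheap GPU

print(f"high end: {high_end:.1e} multiplies/sec")
print(f"budget:   {budget:.1e} multiplies/sec")
```

So the high-end figure works out to roughly 12 trillion 32-bit multiplies per second at peak, and the cheap one to roughly 4 trillion.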

> GPUs have around 32 or 64 physical cores

NVidia SMs and AMD WGPs are not "cores", they are... weird things. They have many shaders inside of them and have huge amounts of parallelism.

As far as grunt-work goes, a "multiplier unit" (literally A x B) is perhaps the most accurate thing to count when comparing CPU cores vs GPU "cores", because a CPU core and a GPU WGP / SM are too weird and different to compare directly.

Split up that WGP / SM into individual multipliers... and also split up the ~3 64-bit multipliers or ~48 CPU SIMD multipliers per core (3x 512-bit on Intel AVX512 cores), and it's perhaps a fairer comparison point.

---------

20 years ago, you'd only have 1x multiplier on a CPU core like a Pentium 4, maybe as many as 4x with the 128-bit SSE instructions.

But today, even 1x core from Intel (3x 512-bit SIMD) or 1x core from AMD (4x 256-bit SIMD) has many, many, many more parallel elements compared to a 2004-era CPU core.
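
The lane counting above can be sketched as follows. This is only the arithmetic from the figures in this comment (number of SIMD units x register width / 32-bit lanes); `simd_lanes` is a made-up helper, and it ignores clocks, issue limits and everything else that affects real throughput.

```python
# Count 32-bit multiplier lanes per CPU core from SIMD unit count and width.
def simd_lanes(units: int, width_bits: int, lane_bits: int = 32) -> int:
    """Hypothetical helper: parallel lanes = units x (register width / lane width)."""
    return units * (width_bits // lane_bits)

lanes_per_core = {
    "Pentium 4, scalar": 1,
    "Pentium 4, 128-bit SSE": simd_lanes(1, 128),
    "Intel core, 3x 512-bit SIMD": simd_lanes(3, 512),
    "AMD core, 4x 256-bit SIMD": simd_lanes(4, 256),
}

for name, lanes in lanes_per_core.items():
    print(f"{name}: {lanes} x 32-bit multipliers")

# How many such Intel cores it takes to match 6000 GPU shaders, by multiplier count alone.
intel_cores_to_match_gpu = 6000 / lanes_per_core["Intel core, 3x 512-bit SIMD"]
print(f"Intel cores to match 6000 shaders: {intel_cores_to_match_gpu}")
```

By this count a modern Intel core is 48 multipliers wide and an AMD core 32, versus 1 to 4 on a Pentium 4, and it would take on the order of 125 such Intel cores to match 6000 shaders on multiplier count alone.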

1 comments

>NVidia SMs and AMD WGPs are not "cores", they are... weird things. They have many shaders inside of them and have huge amounts of parallelism.

They aren't weird things. They are the equivalent of CPU cores. By your logic CPU cores aren't CPU cores, "they are... weird things" because of SMT.

There is more weirdness here than just SMT.

There is the full crossbar, which allows each shader to individually issue a fetch from memory. The shared memory space is not like a cache; it is a shader-to-shader communication scratchpad.

There is also atomics support, including coalescing atomics together.

-------

I mean hell: what is a core? Do remember that on SMs, every single shader (not SM) has its own instruction pointer.

Is the shader a core? No, not really. But SMs aren't a core either.

I wouldn't compare GPU and CPU architecture at all. They're just different. What I did above, breaking both down into individual multipliers and then counting them, seems like the best way forward, especially as we remain multiplier-bound in practice.