Hacker News
by mpreda 514 days ago
And the splitting into CDNA and RDNA comes from the same direction: market segmentation, to allow much higher prices for the CDNA data-center GPUs, while keeping the gamer-focused RDNA GPUs affordable for mere mortals. Of course this backfires, because the powerful GPUs are no longer available for almost anybody to experiment on.

For example this blog post, about how great MI300X is. Really, what do I care -- I'm not a billionaire.

2 comments

> And the splitting into CDNA and RDNA comes from the same direction: market segmentation

Not really.

Wave64 on CDNA provably delivers more throughput. But with most video game code written for NVidia's Wave32, reworking RDNA to be more NVidia-like and Wave32 is how you reach better practical video game performance.

HPC will prefer the wider execute, 64-bit execution, and other benefits.

Video gamers will prefer a large (32MB+) "Infinity Cache", which is used in practice for all kinds of screen-space calculations. But this would NEVER be used for fluid dynamics.

Maybe never by the big players, but RDNA and even fp32 are perfectly fine for a number of CFD algorithms and uses; Stable Fluids-like algorithms and Lagrangian Vortex Particle Methods to name two.

I'm talking about Wave64.

CDNA executes 64 threads per compute unit per clock tick. RDNA only executes 32 threads. CDNA is smaller, more efficient, more parallel, and has much higher compute than RDNA.

Furthermore, all ROCm code from GCN (and older) was on Wave64, because historically AMD's architecture from 2010 through 2020 was Wave64. RDNA changed to Wave32 so that they can match NVidia and have slightly better latency characteristics (at the cost of bandwidth).

CDNA has more compute bandwidth and parallelism. RDNA is narrower, with lower latency and less parallelism. A GPU built from 2048-bit compute (64 lanes x 32 bits wide, as in CDNA) is always going to have more bandwidth than one built from 1024-bit compute (32 lanes x 32 bits wide, as in RDNA).
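
The arithmetic behind the 2048-bit versus 1024-bit comparison can be sketched as below. This is a back-of-the-envelope illustration, not a hardware model: the function names are made up for this example, and it only counts lanes times lane width for one wavefront, ignoring clock speed, the number of SIMD units per compute unit, and anything else that affects real throughput.

```python
import math

def wave_width_bits(lanes: int, bits_per_lane: int = 32) -> int:
    """Total width of one wavefront in bits (lanes x bits per lane)."""
    return lanes * bits_per_lane

def waves_needed(work_items: int, lanes: int) -> int:
    """Number of wavefronts required to cover a given count of work items."""
    return math.ceil(work_items / lanes)

wave64_bits = wave_width_bits(64)  # 64 lanes x 32 bits
wave32_bits = wave_width_bits(32)  # 32 lanes x 32 bits

print(wave64_bits, wave32_bits)                        # 2048 1024
print(waves_needed(1000, 64), waves_needed(1000, 32))  # 16 32
```

The same 1000 work items take half as many wavefronts at 64 lanes as at 32, which is the "wider is more bandwidth" point in its simplest form.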

I wasn't familiar with the "Wave32" term, but took "RDNA" to mean the smaller wavefront size. I've used both, and wave32 is still quite effective for CFD.

ROCm support for RDNA took like 2 years, maybe longer.

If you actually were using both, you'd know that CDNA was the only supported platform on ROCm for what felt like an eternity. That's because CDNA was designed to be as similar to GCN as possible so that ROCm could support it more easily.

--------

What I'm saying is that today, now that ROCm works on RDNA and CDNA, the two architectures can finally be unified into UDNA. And everyone should be happy with the state of code moving forward.

They’re unifying the architectures. AMD will move to UDNA for both gaming and data center. The next graphics cards after RDNA4 will be UDNA. Makes sense given how ML-heavy graphics has become.

The point is they shouldn't have done it in the first place. It was obvious right from the start it's a bad idea, except maybe for temporarily boosting short term profits.

The whole AMD AI/ML strategy feels like this - prioritize short term profits and completely shoot themselves in the foot in the long term.

ROCm was clearly designed with Wave64 in mind. It was going to take years for ROCm to be reworked for Wave32 of RDNA.

DirectX shaders, however, were already ready for Wave32 and the other architectural changes that RDNA had. In fact, RDNA was basically AMD changing their architecture to be more "NVidia-like" in many regards (32-wide execute being the most noticeable).

CDNA existed because HPC has $Billion+ contracts with code written for Wave64 and still needing ROCm support. That means staying on the older GCN-like architecture and continuing to support say, DPP instructions or other obscure features of GCN.
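
A toy illustration of why code written with a fixed wavefront width does not carry over automatically. This is a pure-Python simulation of a cross-lane butterfly sum, not ROCm or HIP code; both function names are hypothetical, and the `IndexError` only stands in for the mismatch (it makes no claim about what real hardware would do).

```python
def wave_reduce_sum(lane_values, wave_size):
    """Butterfly (XOR-partner) sum across one simulated wavefront.

    After log2(wave_size) steps every lane holds the total.
    """
    assert len(lane_values) == wave_size
    assert wave_size & (wave_size - 1) == 0, "wave size must be a power of two"
    vals = list(lane_values)
    offset = wave_size // 2
    while offset >= 1:
        # Each lane adds the value held by its partner lane (lane XOR offset).
        vals = [vals[lane] + vals[lane ^ offset] for lane in range(wave_size)]
        offset //= 2
    return vals

def wave_reduce_sum_hardcoded64(lane_values):
    """Same reduction, but with the step offsets for 64 lanes baked in."""
    vals = list(lane_values)
    for offset in (32, 16, 8, 4, 2, 1):
        # On a 32-lane wave, lane XOR 32 points at a lane that does not exist.
        vals = [vals[lane] + vals[lane ^ offset] for lane in range(len(vals))]
    return vals

data = list(range(64))
print(wave_reduce_sum(data, 64)[0])       # 2016
print(wave_reduce_sum(data[:32], 32)[0])  # 496
try:
    wave_reduce_sum_hardcoded64(data[:32])
except IndexError:
    print("hard-coded 64-lane offsets break on a 32-lane wave")
```

The parameterized version works at either width; the hard-coded one is the kind of assumption that has to be found and reworked across a whole code base.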

---------

Remember how long it took for RDNA to get ROCm support? Did you want to screw the HPC customers for that whole time?

Splitting the two architectures, focusing ROCm on HPC (where the money was in 2018 for GPU Compute research dollars), and focusing on better video game performance for RDNA (where money is for video game / consumer cards) just makes sense.

> The whole AMD AI/ML strategy feels like this - prioritize short term profits and completely shoot themselves in the foot in the long term.

That's what the stock market rewards.