by moondistance 360 days ago
Yes, for many applications.

Meta, OpenAI, Crusoe, and xAI recently announced large purchases of MI300 chips for inference.

MI400, which will be available next year, also looks to be at least on par with Nvidia's roadmap.

(this is also why AMD popped 10% at open yesterday - this is a new development and talks from their 2025 "Advancing AI" event were published late last week + over the weekend)
Is the software stack still lacking?
Yeah it's still a few years behind but it's getting better. They are hiring software and tooling engineers like crazy. I keep tabs on some of the job slots companies have in our area and every time I check AMD they always have tons of new slots for software, firmware, and tooling (and this has been the case for ~3 years now).

They've been playing catch up after "the bad old days" when they had to let a bunch of people go to avoid going under but it looks like they are catching back up to speed. Now it's just a matter of giving all those new engineers a few years to get their software world in order.

They pay hardware rates to software engineers (a principal engineer at the salary level of a decent fresh graduate), so I wouldn't be too optimistic about them attracting software people who would propel them forward.
At least where I live (very much not west coast), their SW and HW rates are at or above what we normally see in this area.
Stock is undervalued. If you get in now and it pops over the next few years, it'll likely make up for lower compensation.
You don't need to work at AMD to buy their stock.
True, but if you don’t have a job, where’s the money for buying stock coming from?
We're forbidden from trading our own stock anyway, SEC regulation on insider trading and all.
You're "talking your book".
They pay terribly and still have legacy old-guard managers. If you want to innovate on software, you should look elsewhere or really make sure your manager knows what’s what.
FWIW for the first time in 2+ years I managed to compile llama.cpp with ROCm out of the box and run a model with no problems* on Linux (actually under WSL2 as well), with no weirdness or errors.

Every time I have tried this previously it has failed with some cryptic errors.

So from this very small test it has got way better recently.

*Did have problems enabling the WMMA extensions though. So not perfect yet.

If this has been an issue for two years, then it's not a ROCm or llama.cpp problem.
Oh, I'm sure you are right that it's operator error, but I'd always have some issue installing ROCm, getting the paths right, or something. This is the first time I've managed to install ROCm following the commands exactly and then compile llama.cpp without having to adjust anything.

BTW, this kind of dev experience really does matter. I'm sure it was possible to get it working previously, but I didn't have the level of interest to make it work, even if it was somewhat trivial. Being able to compile out of the box makes a big difference. And AFAIK this new version is the first to properly support WSL2, which means I don't have to dual boot to even try and get it working. It's a big improvement.

You can blame the user for not using the tools correctly, or the manufacturer for making difficult-to-use tools that aren’t straightforward or don’t work in various non-happy-path conditions (i.e. unreliable installers).

For example, to this day installing MSVC doesn’t make a sane default compiler available in a terminal: you have to open their shortcut that sets up environment variables, and you have to just know this is how MSVC works. Is this a user problem, or Microsoft failing to follow the same conventions every other toolchain installer follows?

Yes, big time, but there continues to be lots of progress.

Most importantly, models are maturing, and this means less custom optimization is required.

Yes, I'd agree with that. There is so much demand for inference, which is maturing rapidly, that even if a lot of the "R&D" is done on Nvidia cards because of their (vastly, let's be fair) better software stack, doing the inference on AMD is still an enormous market if AMD is competitive on the inference side (and, perhaps more importantly, has shorter lead times).

I suspect we will be (or already are?) at a point where 95%+ of GPUs are used for inference, not training.