by modeless 1260 days ago
The hardware is not their major problem. They have been failing super hard at the software side of machine learning for a solid decade now.

It seems like pure management incompetence to me. They need to invest a whole lot more in software, integrating their stuff directly into pytorch/TF/XLA/etc and making sure it works on consumer cards too. The investment would be paid back tenfold. The market is crying out for more competition for Nvidia and there's huge money to be made on the datacenter side but it all needs to work on the consumer side too.


AMD has finite resources, like any company, and they’ve been focusing on CPU/datacenter dominance, which to me is both the safer bet and the more-lucrative bet. It wasn’t that long ago that AMD was on the brink of bankruptcy (~2016), so I appreciate that they’re not trying to divide their attention.

Their attempts at entering the ML space so far have been failures, and they are wise to hold off on really competing with Nvidia until they have the bandwidth to go “all in”. Consciously NOT trying to compete with Nvidia is the reason they didn’t go bankrupt. Their Radeon division made money from 2016-2020 because they focused on a niche Nvidia was neglecting: low-end/eSports (also leveraging their APU expertise to win PS4/Xbox contracts).

I think Nvidia will eventually lose its monopoly on ML/AI stuff as AMD, Apple, Qualcomm, Amazon and Google chip away at its “moat” with their own accelerators/NPUs. As mentioned though, Nvidia's edge really comes from CUDA and other software, not the hardware. I doubt that Apple, Qualcomm, Amazon or Google will be interested in selling hardware direct to consumers. They want that sweet, sweet cloud money and/or competitive advantages in their phones (like photo processing). I don’t want to be paying AWS $100/mo for a GPU I could pay $600 once for. I do think AMD/RTG will go hard on Nvidia eventually, and it will not matter whether you have an AMD or Nvidia GPU for TensorFlow or spaCy or whatever else.

> AMD has finite resources, like any company, and they’ve been focusing on CPU/datacenter dominance, which to me is both the safer bet and the more-lucrative bet.

Cloud/datacenter-based ML is a huge growth market. Having the same software work on consumer GPUs and enterprise ML cards is one of Nvidia's competitive advantages.

> AMD has finite resources, like any company, and they’ve been focusing on CPU/datacenter dominance

Why can't large companies tap the investment market? E.g. they could sell bonds to fund it, borrow, etc.

Well they definitely can, and do. They have made the decision that the reward isn't worth the risk.

The idea (not aiming at you here, you didn't say this) that senior leadership at AMD is unaware of NVIDIA's lead in this space, and hasn't repeatedly considered whether to invest in competing, is absurd. Likewise, the idea that anyone outside of AMD understands better than AMD does what it would take in terms of investment _and opportunity cost_ is also absurd.

Senior leadership at AMD isn't dumb. The fact that they're not doing something we want doesn't make them dumb, either. Again, not aiming at you with this little rant :)

They may not be dumb, but they may have narrow vision (which I suppose is still dumb). Bad decisions in the top tiers of the world's largest companies are not unheard of.
Yeah but no one can deny Lisa Su is an MVP and knows what she’s doing. She’s borderline overqualified for CEO of AMD IMO
> but no one can deny Lisa Su is an MVP and knows what she’s doing

In her area. We have no real info on how proficient she is in other areas like ML.

> The idea that senior leadership at AMD is unaware... Senior leadership at AMD isn't dumb.

Let's try this with another company:

The idea that leadership at Lehman Brothers is unaware of the fact that they are trading subprime loans is absurd! The leadership isn't dumb.

The idea that leadership at Boeing is unaware of safety issues with the 737 Max is absurd! How could you suggest that anyone outside Boeing understands better than they do the risks involved?

The fact that a couple of other companies have had dumb leadership in no way proves that AMD's is dumb. You're essentially claiming that because Lehman and Boeing had dumb leadership, all senior leadership at all companies is dumb (because there's nothing linking AMD to these other companies except that they're all companies). And that is an absurd claim.
I am not sure if they are dumb, the jury is still out, but your entire post is the textbook example of a logical fallacy, appeal to authority.

You are claiming that AMD leadership is infallible based on no evidence whatsoever. That's what's absurd.

The idea that AMD management can't possibly have made any bad decisions is the absurd thing here. It's entirely possible that AMD carefully considered Nvidia's position, carefully considered their strategy, and confidently made the wrong decision. It happens all the time in all sorts of companies.

I think it's very clear with the benefit of hindsight that not investing enough into the software side of deep learning early on was a bad decision. But it was obvious to me even at the time and I said as much to anyone who would listen (e.g. seven years ago https://news.ycombinator.com/item?id=12258027)

Maybe they are not dumb. But my next GPU will be nVidia because of their decision. Hence I am disappointed at not having a competitive rival.

I may not know better. But I know what I like.

Can you guess why they would be putting hardware into the ML workspace field but not backing that up with equivalent software integration?
Last I checked they see deep learning training as a niche market, their strategy is to try to win big contracts (HPC etc) and then supply software specifically for that. Then "the community" will supply software. Having spent a bunch of time beating my head on this and related walls it's not clear to me that they're entirely wrong from an economic standpoint. Remember that 2/3 public cloud providers have their own chips as well as NVIDIA's so it would be tough to negotiate a good deal. As a user it's super irritating to be stuck on NVIDIA especially when Jensen gets up on stage to say "haha, Moore's law is over, stop expecting our products to get cheaper."
I hope they change their minds, at least now that generative models are becoming somewhat popular. I'd love to be able to get an AMD card to run generative models, but to the best of my knowledge, they only run on Nvidia hardware.
No personal experience, but you can actually get Stable Diffusion to run on AMD cards.

It uses DirectML on Windows: https://gist.github.com/averad/256c507baa3dcc9464203dc14610d... This is thanks to Microsoft, not AMD.

On Linux you can use ROCm: https://www.videogames.ai/2022/11/06/Stable-Diffusion-AMD-GP...

The horrible install process, and what a mess this all is, is down to AMD.

I don't have any experience with DirectML but it sounds promising.
I wouldn't hold my breath, and anyway at this point NVIDIA has faster chips and more supported software all the way down the stack. My previous startup tried to solve some of these problems and we built what is as far as I know still the only reasonably complete device-portable deep learning framework. Today something like an RTX 3070 is a good budget option for small experiments and you can always lean on a cloud provider if you need more compute temporarily. Hard to beat a TPU pod when you're in a hurry.