Hacker News
by 0xDEAFBEAD 21 days ago
I still think NVIDIA is a bad bet--where is their moat in the long term? Doesn't the sort of work NVIDIA engineers do look vulnerable to AI-assisted automation? NVIDIA engineers code against a well-defined test suite/specification, right?
7 comments

Their moat is CUDA, the CUDA libraries, and everything built on top.

When a new architecture drops, it's always PyTorch running on CUDA; other PyTorch backends are best effort. Even if they reach feature parity, many industry power users have gone closer to the metal to squeeze out performance, and that code is too specific to Nvidia hardware.

If something is going to beat Nvidia, it won't be a product that reaches feature parity with slightly better economics (like AMD; also, Nvidia could just reduce its margins). It needs to be a novel approach worth rewriting the codebase for (maybe Cerebras, maybe a new player).

> Their moat is cuda and cuda libraries and everything built on top

Sure, but to state the obvious, that is only a factor for people using CUDA!

There are also whole segments of the AI market, like Google using TPUs and Amazon using Trainium chips, where CUDA is irrelevant.

If the AI boom is really going to happen, then inference volume needs to ramp up and dominate training costs, and the winners are going to be whoever can do inference the cheapest, which probably isn't going to be anyone paying the NVIDIA tax!

The benefit of CUDA is more for development, and the hyperscalers serving models that use CUDA APIs - bespoke business models. Anthropic currently support both CUDA and Trainium, and X.ai (who seem to be fizzling out) are CUDA, although there was some talk of Musk getting Samsung to make "AI chips" of some sort.

As far as AMD goes, I'm sure the developers at AMD's biggest sites - the exascale national labs - have a whole other level of support than consumers, and no doubt a toolset that works great for those fixed environments.

I don't understand why AMD can't offer a drop-in replacement for cuda which implements an identical API.

How much actual diversity is there among standard AI workloads? I would expect this is an 80/20 thing where 80% of the workload uses 20% of the features.

>Nvidia could just reduce their margins

Commoditization is great for stock prices ;-)

Three things: they can, there is precedent for it in Google v. Oracle (over the Java API), and they already have something!

AMD engineered something called HIP, a set of CUDA-API-compatible libraries that target AMD's hardware. It's the closest thing we have to a drop-in replacement for Nvidia's software moat.

It works for simple stuff but loses badly on frontier kernels (like Flash Attention 3), novel approaches (e.g. Mamba), and networking (e.g. NCCL). It is also rough around the edges, so what you gain in GPU costs is lost in engineering costs.
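
To make the "API-compatible" point concrete: for the simple cases, porting is mostly a mechanical rename of CUDA runtime calls to their HIP counterparts, which is roughly what AMD's hipify tools automate. Below is a toy Python sketch of that idea, not the real tool. The function name `toy_hipify` and the tiny mapping table are made up for illustration (the CUDA and HIP identifiers in the table are real), and the actual tools cover vastly more of the API.

```python
import re

# Small, hand-picked subset of CUDA runtime names and their HIP counterparts.
# Illustrative only: the real hipify tools use a far larger table.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Try longer names first so cudaMemcpyHostToDevice is not matched as cudaMemcpy.
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
)


def toy_hipify(source: str) -> str:
    """Rename the known CUDA runtime identifiers to their HIP equivalents."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)


cuda_snippet = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_x);
"""

print(toy_hipify(cuda_snippet))
```

The rename is the easy part. Hand-tuned kernels and collective-communication code depend on hardware behavior that no identifier mapping captures.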

My previous company (Rivos) tried to compete in this GPU game while investing in a good software stack: a cheaper drop-in replacement with decent software.

But that vision was rough. Any new player had to implement the bad APIs because of backward-compatibility concerns; following the specs wasn't sufficient, since a lot of the AI stack depended on observable behavior (Hyrum's Law); and Nvidia simply had a long head start. The company is now dead (acquired by Meta), and AFAIK there isn't another player.

Best-case scenario, AMD puts more effort into their software stack, but I just think they do not have enough internal talent to compete.

Training will continue to be an Nvidia thing, and that's where most of the money sits, unless the AI research scene suddenly pivots to JAX. I do not see that coming any time soon; if anything, I've seen internal efforts at Google to make PyTorch work nicely with TPUs. Some players like Anthropic have started using JAX for training, but all the small players are using Nvidia. I'm guessing that has something to do with Nvidia partnering aggressively with startups.

I think AMD have essentially given up on the consumer / small-scale GPU compute market, while being extremely successful selling their AI chips to much bigger customers. Some of the biggest supercomputers (clusters) in the world, such as the Lawrence Livermore and Oak Ridge exascale computers, are AMD Instinct based, but the tools and level of support they get are not going to be the same as for someone at home trying to get ROCm running on their gaming card.

I wonder how big the market is for consumer/etc vs these massive installations?

> I don't understand why AMD can't offer a drop-in replacement for cuda which implements an identical API.

AMD, Apple and Intel all sell raster GPUs. Their GPU architecture is not optimized for general-purpose compute, and reorienting around that goal would create a "Fifteen Competing Standards" scenario pretty quickly. It's as much of a hardware issue as it is a software one, and none of these businesses like to cooperate (see: the last 15 years of Khronos drama).

In AMD's case, they don't see a need to sell consumer GPUs with a true CUDA analog since their datacenter product is architecturally distinct from their consumer GPUs. Consumers come to AMD for cheap graphics performance, and adding additional hardware on top of the compute units would be a waste of money for many (or most) customers. This is why you see such a rift between CDNA and RDNA chips on compute workloads, and why it's unlikely that we'll see a CUDA-equivalent product out of AMD any time soon.

At some point there will be models that are "good enough" and run on Chinese chips, mobile processors, and run-of-the-mill chips from Apple. Whether it comes from one-bit or ternary models, innovations that limit the size of the context, or something else, it is coming. The balance has already shifted toward making these systems less resource intensive, which is a clear need given the enormous data center costs.
AMD should have been ideally placed to compete with them, and haven't.

> NVIDIA engineers code against a well-defined test suite/specification, right?

The spec is the value. And the patents.

I admit I'm not too knowledgeable about the semiconductor industry. But it seems to me that there are two likely scenarios: AI Bear or AI Bull.

In the AI Bear scenario, NVIDIA is obviously overvalued.

In the AI Bull scenario, we get full automation of software engineering. With "just a few clicks", an AMD employee can extract and replicate whatever subset of the spec is needed for AI workloads. Didn't the Google v. Oracle case find that copying an API can be fair use? And NVIDIA's patents haven't stopped Google from training on TPUs, have they?

The most reasonable story you can tell for an Nvidia moat is their know-how in designing datacenter-scale hardware and getting it fabbed and deployed. That's inherently hard to replicate. CUDA itself can be replicated in theory (it's basically just a compute API), but that turns out not to be worth it, since the Nvidia ecosystem really is higher quality for the cost.
The thing to squint at is Microsoft Windows. Windows is just a small bit of software, like CUDA is for Nvidia. It shouldn't be that hard to copy. Look at Ubuntu: look at how functional a replacement for Windows it is, and how much that hasn't mattered one bit.
Only on HN will people doubt the moat of a company with >5T market cap at an annualized revenue of 400B with YoY growth close to 100%. Yes, bubbles gonna bubble, but what?
Annualized revenue and YoY growth have little to do with long-term moat width.
I don't think that holds, since the core CUDA toolkit is proprietary.