| HN Mirror



	by changoplatanero 866 days ago
	Doesn't Nvidia have huge margins? So if someone just makes a clone of the Nvidia GPU, it can erode their margins and drive down the cost of compute.

4 comments

hashxyz 866 days ago

AMD will succeed at this as long as they keep it together.

yen223 866 days ago

Every time I'm tempted to think software is easy compared to hardware, I just remember that AMD is leaving about a trillion dollars' worth of market cap on the table, because they haven't figured out a good alternative to CUDA.

greenknight 866 days ago

They are definitely putting a lot of effort into ROCm & HIP, and the pace is definitely accelerating.

ROCm 6 was out Dec 16 (2023), 5.5 was May (2023). 5 was Feb 10 (2022). 4 was Dec 19 (2020)

patfla 866 days ago

Fred Brooks wrote in The Mythical Man-Month that it's harder (more time-consuming) to produce the software that corresponds to a given hardware. In 1975.

SkyMarshal 866 days ago

Hardware was much simpler then than it is now. I wonder how, or if, that's changed by going from hundreds or thousands of transistors to billions.

fnordpiglet 866 days ago

They’ll need to either reverse engineer CUDA or incentivize reimplementation of everything out there to use ROCm/OpenCL and forgo all the workload optimization done for Nvidia GPUs. I think that’s a nontrivial moat.

moralestapia 866 days ago

This has been my perception of AMD for the past 20 years. First against Intel, then ARM, now NVIDIA. "If only ..."

Cacti 866 days ago

The real bitch is you need to both replicate the software and convince some large projects (e.g., PyTorch) to use and support your implementation, and it’s just all rough, very complicated, very fine-grained stuff. The hurdles here are very high.

And if you fuck that part up in any one of a dozen places, no one will use it, because the adoption cost is too high, or your implementation was 20% slower and so everything costs 20% more to use and no one uses it.

This is why things like TPUs never really damage NVIDIA, and why basically everyone is focused on open standards and open software. Basically the entire tech industry is using this approach as a way to slowly peel away the layers of this software until enough has been removed that NVIDIA can no longer use it as a moat.

jszymborski 866 days ago

While I doubt OpenAI will be a good fit for semiconductors, my understanding is PyTorch and TensorFlow have been really good at embracing new accelerators, largely due to XLA.

PyTorch, TF, and JAX work great on TPUs. Adoption is low bc they are not really available outside the Google cloud.

coredog64 865 days ago

AWS uses tricks to accelerate PyTorch with Inferentia/Trainium. Haven’t used it, but I have tried the equivalent for Apple silicon and rage quit after wasting half a day.

Cacti 865 days ago

I mean, it took almost a decade to get there.

jszymborski 865 days ago

Right, but that was for XLA, no? I think (not an expert) that it compiles code from frameworks into a lower-level IR.

That's gotta be way easier, no?

bbcc90 866 days ago

If you are going to go vertical then do it properly.

OpenAI could just build their own framework for internal use that works well on their silicon (see JAX+TPU).

Their starting point? Triton plus some Triton libs. JAX chipped away at TF like this, and there's no reason why Triton can’t do the same to PyTorch.

7e 866 days ago

Competitors don't have access to the process node. You'll get competitors, but they won't be as fast or able to run the latest models. That means they'll compete with older versions of NVIDIA's chips.

dsalzman 866 days ago

Agreed. Commoditizing the complement of OpenAI's models.
