

by glitchc 687 days ago

Indeed, you can strip out a whole host of things from the GPU: the framebuffer, the Z-buffer, the transform and lighting engine. Instead, you fill it with more CUDA cores, a higher-bandwidth memory controller with a larger bus, etc.

And, as it happens, that's exactly what NVidia's done with the H100: https://developer.nvidia.com/blog/nvidia-hopper-architecture...

It still needs to be programmable though. Can't get away from that.

2 comments

kolinko 687 days ago

You can get away from that if you constrain it to a specific type of model (say, attention-based).


adastra22 687 days ago

You don’t need general programmability for AI inference.


glitchc 687 days ago

The money's in the training, not the inference.

If you look at Apple and Google, they already have their own hardware for inference in their smartphones. They don't need NVidia for that.


mupuff1234 687 days ago

Apple and Google use TPUs for training.

https://www.cnbc.com/2024/07/29/apple-says-its-ai-models-wer...


glitchc 687 days ago

Hmmm, that's worse for NVidia.


adastra22 686 days ago

NVIDIA owns the interconnects that are used for this training. I’m sure they have their own competing AI accelerator they are working on too.


adastra22 686 days ago

You don’t need programmability for AI training either.
