HN Mirror


by m00x 564 days ago

Because the CPU has to load the model in parts on every cycle, you spend a lot of time on IO, and that offsets the processing.

You're talking about completely different things here.

It's fine if you're doing a few requests at home, but if you're actually serving AI models, CUDA is the only reasonable choice other than ASICs.
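To put the IO point in rough numbers, here is a minimal back-of-the-envelope sketch. It assumes every generated token requires streaming all of the weights from memory once, so memory bandwidth caps the token rate. The model size and both bandwidth figures are illustrative assumptions, not measurements of any particular hardware, and the estimate ignores batching, caching, and compute limits.

```python
def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode rate if each token streams all weights once."""
    return bandwidth_bytes_per_s / model_bytes


GB = 1e9

# Hypothetical model: 7 billion parameters at 2 bytes each (14 GB of weights).
model_bytes = 7e9 * 2

# Illustrative memory bandwidths (assumed values, not benchmarks).
cpu_dram_bandwidth = 50 * GB     # desktop-class system RAM
gpu_vram_bandwidth = 1000 * GB   # high-end accelerator memory

cpu_rate = max_tokens_per_second(model_bytes, cpu_dram_bandwidth)
gpu_rate = max_tokens_per_second(model_bytes, gpu_vram_bandwidth)

print(f"CPU bound: {cpu_rate:.1f} tokens/s")  # CPU bound: 3.6 tokens/s
print(f"GPU bound: {gpu_rate:.1f} tokens/s")  # GPU bound: 71.4 tokens/s
```

Under these assumptions the ratio between the two bounds is just the ratio of the memory bandwidths, which is why the amount of RAM alone doesn't settle the question.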

1 comments

treprinum 564 days ago

My comment was about Intel having a starter project, getting an enthusiastic response from devs, building network effects, and iterating from there. They need a way to threaten Nvidia, and just focusing on what they can't do won't get them there. There is one route by which they can disrupt Nvidia's high end over time, and that's a cheap basic GPU with lots of RAM. It's like first-gen Ryzen: its single-core performance was two generations behind Intel's, yet it trashed Intel by providing 2x as many cores for cheap.


m00x 564 days ago

It would be a good idea to start with a basic understanding of GPUs and to realize why this can't easily be done.


treprinum 564 days ago

That's a question the M3 Max with its integrated GPU already answered. It's not as if I'm completely clueless about how GPUs work; I've done HPC and CUDA work in the past, though I haven't written those libraries myself.


m00x 564 days ago

What have you implemented in CUDA?
