by brucethemoose2 912 days ago
Inference is going to be interesting in 2025.

By that time we will have a good number of MI300 hosts. AMD Strix Halo (and the Intel equivalent?) will be out for high-memory jobs locally. Intel Falcon Shores and who knows what else will finally be coming out, and from the looks of it the software ecosystem will be at least a little more hardware agnostic.


https://www.amd.com/en/products/apu/amd-ryzen-7-7840hs This has existed since 2023 with a "Neural Processing Unit".

Seems like if you want to catch the wave, it's really already here. Not sure what this thing can do, and hope to find out next year, but local AI is going to be a killer app.

The NPU is reportedly slower than the IGP (just like in Apple devices) and restricted in what it can do. Both it and the GPU are bottlenecked by the 128-bit memory bus anyway. The 7840HS is just not fast in generative AI, even in the best case scenario.

Strix Halo's memory bus will be twice as wide and run at a higher speed, and the GPU will be much bigger. It will be closer to a small GPU with a huge VRAM pool, rather than a dreadfully slow IGP.
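
To put rough numbers on the memory bus point, here is a back-of-envelope sketch. It assumes token generation is purely bandwidth-bound, so every generated token reads all the model weights once. The transfer rates (5600 and 8000 MT/s) and the 4 GB model size are my own illustrative assumptions, not figures from this thread, and real throughput will land below these ceilings.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, transfer_rate_mt_s: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mt_s * 1e6 / 1e9


def max_tokens_per_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if each token reads all weights once."""
    return bandwidth_gb_s / model_size_gb


# Assumed for illustration: a quantized model occupying about 4 GB.
MODEL_SIZE_GB = 4.0

# Bus widths follow the thread (128-bit vs. twice as wide);
# the transfer rates are assumptions, not quoted specs.
configs = {
    "128-bit bus @ 5600 MT/s": (128, 5600),
    "256-bit bus @ 8000 MT/s": (256, 8000),
}

for name, (width_bits, rate_mt_s) in configs.items():
    bw = peak_bandwidth_gb_s(width_bits, rate_mt_s)
    ceiling = max_tokens_per_s(bw, MODEL_SIZE_GB)
    print(f"{name}: {bw:.1f} GB/s, at most {ceiling:.1f} tokens/s")
```

Under those assumptions the 128-bit configuration tops out around 90 GB/s and the wider, faster one around 256 GB/s, so the theoretical ceiling on tokens per second nearly triples before the bigger GPU is even taken into account.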