


	by noelwelsh 367 days ago
	I think the main limitation, right now, is hardware. For GPUs the main limit is the VRAM available on consumer models. CPUs have plenty of memory but don't have the bandwidth or vector compute power for LLMs. This is why I think the Strix Halo is so exciting: it has bandwidth + compute power plus a lot of memory. It's not quite where it needs to be to replace a dedicated GPU, but in a few iterations it could be. I'm interested in other opinions. I'm no expert on this stuff.
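
To make the VRAM point concrete, here is a rough back-of-envelope sketch of how much memory a model's weights need at a given quantization. The function names, the 20% overhead allowance for KV cache and activations, and the example sizes are assumptions for illustration, not measurements.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    # 1e9 params * (bits / 8) bytes each, divided by 1e9 bytes per GB.
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         memory_gb: float, overhead_fraction: float = 0.2) -> bool:
    """True if weights plus an assumed overhead fit in memory_gb.

    overhead_fraction is a hypothetical allowance for KV cache and
    activations; the real figure depends on context length and runtime.
    """
    needed = weight_footprint_gb(params_billions, bits_per_weight)
    return needed * (1 + overhead_fraction) <= memory_gb


if __name__ == "__main__":
    # Hypothetical 70B-parameter model quantized to 4 bits per weight.
    print(weight_footprint_gb(70, 4))  # weights-only size in GB
    print(fits(70, 4, memory_gb=24))   # a 24 GB consumer GPU
    print(fits(70, 4, memory_gb=128))  # a 128 GB unified-memory machine
```

Under these assumptions the weights alone outgrow a typical consumer card well before they outgrow a 128 GB shared-memory machine.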


jb1991 367 days ago

How does the shared memory model for GPUs on Apple Silicon factor into this? These machines are technically consumer grade and not very expensive, but they can offer a huge amount of memory, since all of it is shared between CPU and GPU. Even a mid-tier machine can easily have 100 GB of GPU memory.


noelwelsh 367 days ago

If you squint, the M4 is the same as the Strix Halo. The M4 has roughly

* double the bandwidth;

* half the compute; and

* double the price for comparable memory (128GB)

compared to the Strix Halo.
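
A rough sketch of why the bandwidth ratio matters: if single-stream token generation is memory-bound, each new token requires reading roughly all the weights once, so bandwidth divided by model size gives an upper bound on tokens per second. The function name and the numbers below are placeholders chosen to reflect the "roughly double" ratio. They are not measured specs, and the sketch ignores compute limits, KV cache traffic, and batching.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed for a dense model at batch size 1.

    Assumes every weight is read from memory once per generated token,
    so throughput cannot exceed bandwidth / model size.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    model_gb = 32.0  # hypothetical quantized model size
    # Placeholder bandwidths in a 1:2 ratio, not vendor figures.
    for name, bandwidth in [("chip A", 256.0), ("chip B", 512.0)]:
        bound = max_tokens_per_second(bandwidth, model_gb)
        print(f"{name}: at most {bound:.1f} tokens/s")
```

Under this model, doubling bandwidth doubles the ceiling on generation speed, while the extra compute mostly helps with prompt processing, which is not captured here.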

I'm more interested in the AMD chips because of cost and because, while I have an Apple laptop, I do most of my work on a Linux desktop. So a killer AMD chip works better for me. If you don't mind paying the Apple tax then a Mac is a viable option. I'm not sure about the software side of LLMs on Apple Silicon, but I cannot imagine it's unusable.

An example of a desktop with the Strix Halo is the Framework Desktop (AI Max+ 395 is the marketing name for the Strix Halo chip with the most juice): https://frame.work/gb/en/products/desktop-diy-amd-aimax300/c...


ezschemi 367 days ago

I am also very interested in AMD's Strix Halo for running LLMs locally. For that I have a Framework Desktop on order (batch 1!). Alex Ziskind on YouTube does videos comparing Strix Halo, M4 Mac mini and MacBook Pro, Nvidia 5090, etc., including power consumption. The only downside is that one has to pull the numbers out of the videos; there are no tables or anything. Here is the recent video testing Strix Halo and a Mac mini: https://www.youtube.com/watch?v=B7GDr-VFuEo


justincormack 367 days ago

Apple has machines with 2x and about 3x the Strix Halo bandwidth by doubling up the memory buses. These get expensive though.
