by MrArthegor 74 days ago
A good technical project, but honestly useless in like 90% of scenarios.

Want to use an NVidia GPU for LLMs? Just buy a basic second-hand PC (the GPU is the primary cost anyway). Want a Mac for a good amount of VRAM? Buy a Mac.

With this proposed solution you get a half-baked system: on one hand the GPU is limited by the Thunderbolt port and you don’t have access to all of NVidia's tools and libraries, and on the other hand you have a system that lacks the integration of a native solution like MLX and risks breaking in a future macOS update.

9 comments

Chicken/egg. NVidia tooling is lacking surely in part because the hardware wasn’t usable on macOS until now. Now that it’s usable that might change.
Nvidia GPUs were usable on Intel Macs, but compatibility got worse over time, and Apple stopped making a Mac Pro with regular PCIe slots in 2013. People then got hopeful about eGPUs, but they have their own caveats on top of macOS only fully working with AMD cards. So I've gotten numb to any news about Mac + GPU. The answer was always to just get a non-Apple PC with PCIe slots instead of giving yourself hoops to jump through.
The 2019 Intel Mac Pro had PCIe slots. The Apple Silicon Mac Pro still has them as well, but they’re pretty much useless.
Nvidia tooling like CUDA has worked on AArch64 UNIX-certified OSes since June of 2020: https://download.nvidia.com/XFree86/Linux-aarch64/

The software stack has been ready for Apple Silicon for more than half a decade.

Until there is official support for Mac coming from nvidia, I don't think anything will happen.

> the hardware wasn't usable on macOS

This eGPU thing is from a third-party if I understand correctly. I don't see why nvidia would get excited about that. If they cared about the platform, they would have released something already.

The eGPU "thing" should work on anything that supports Thunderbolt, since Thunderbolt has native support for PCIe.
The point is that if nvidia cared about the Mac platform they would have done something to make eGPUs usable on Mac a long time ago.

Even on Intel Macs using eGPU with nvidia cards was near impossible. nvidia just doesn't care about it after the breakdown of the two companies' relationship.

Whether a third party has created a signed driver or not doesn't matter much until there is more interest from the GPU maker. This barely moves the needle.

Wrong.

If a model can run on a 512GB M3 Ultra via MLX or CUDA and simultaneously benefit from the memory bandwidth of something like an RTX 6000 Pro, that would save my company hundreds of thousands of dollars. That's $20,000 for roughly 600GB of VRAM, and enough token generation speed to fulfill the needs of any enterprise that's not a hyperscaler or neocloud.

I'll let someone else do the math for you on what it costs to put together a 10U server to get that kind of performance without the $10K M3 Ultra Studio.

What we're paying for five old 80GB A100s is criminal, but it's nothing compared to what these GB200 Blackwell setups are going to cost in 2030. Market economics aside, the fact that they require sophisticated liquid cooling infrastructure and draw 3x the power of the A100s will make these cards unattainable for small to medium organizations.

So yeah, if there's some outside chance that we can pair NVIDIA's speed with an Arm-powered machine that offers 512GB of unified memory while drawing 50W -- you better believe it's a big deal. We'll see. Sounds too good to be true.

Thank you for opening my mind to a viewpoint I didn’t even know existed.

Yes, for many scenarios this is "not even an academic exercise".

For a very select few applications this is Gold. Finally serious linear algebra crunch for the taking. (Without custom GPU tapeout.)

"Nvidia." Not NVidia or nVidia, or the other ways. I feel that I can frequently figure out if someone is going to express a negative view about this company based only on whether they picked a weird way to write their name.
Their logo literally has a lowercase "N" in their name.
I mistook eGPU for virtual GPU, but I was wrong: it means external GPU.
> the GPU is limited by the Thunderbolt port

Not everything is limited by the transfer speed to/from the GPU. LLM inference, for example.
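
To put rough numbers on that, here is a minimal back-of-envelope sketch. It assumes the whole model fits in the card's VRAM, so the weights cross the link once and only a small amount of data moves per generated token. The throughputs, model size, and per-token traffic are hypothetical round figures, not measurements.

```python
# Back-of-envelope only: every number below is an assumption, not a measurement.
GB = 1e9

def transfer_seconds(num_bytes, link_bytes_per_s):
    """Time to move num_bytes over a link with the given sustained throughput."""
    return num_bytes / link_bytes_per_s

# Assumed sustained throughputs (hypothetical round numbers).
links = {
    "thunderbolt_egpu": 3 * GB,  # assumed usable PCIe throughput over Thunderbolt
    "pcie4_x16": 25 * GB,        # assumed usable throughput of a native x16 slot
}

weights_bytes = 40 * GB       # assumed model size, fully resident in VRAM
per_token_bytes = 16 * 1024   # assumed host<->GPU traffic per generated token

for name, bw in links.items():
    load_s = transfer_seconds(weights_bytes, bw)
    token_s = transfer_seconds(per_token_bytes, bw)
    print(f"{name}: one-off weight load {load_s:.1f} s, "
          f"per-token transfer {token_s * 1e6:.1f} us")
```

Under these assumptions the slower link mostly shows up as a longer one-off load, while the per-token transfer stays in the microsecond range. A setup that splits a model between the Mac's memory and the GPU would move far more data per token, and this estimate would not apply.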

> GPU is limited by the Thunderbolt port

I thought Thunderbolt was like pluggable PCI? The whole point was not to limit peripherals.

There's more to peripheral limits than the protocol used. Thunderbolt connections have higher latency and lower bandwidth than native PCIe. Both, either, or neither of those things may be much of an actual problem (depending on the use case), but they are some examples of limits vs native PCIe.
Even when running ML experiments, you'd mostly want to run them on rented clusters anyway.
the tooling is just the standard linux tooling inside the container, no? and thunderbolt is not a real limitation