by znpy 55 days ago
As somebody who has a vague interest in running local LLMs… the day I decide to burn cash on hardware I might as well go all-in and get either a 128GB Mac Studio or an NVIDIA DGX Spark (or some other equivalent GB10-based system).

The 64GB Mac mini is also interesting, if only because it is very likely to hold most of its value when reselling.

I’m keeping an eye on the next Apple hardware refreshes, particularly for Mac minis and Mac Studios.

4 comments

I am in a similar boat to you, but I can’t make the money math work. Local LLMs obviously have a privacy benefit, but DeepSeek V4 Flash (which you’ll struggle to get running on any single Mac; you’d need at least 128GB RAM) is $0.14/mtok input and $0.28/mtok output on the API. You’d have to be just absolutely burning tokens to ever make this make sense.

A Mac Studio M4 Max with 128GB at $3,699 (if you can find it) would take about 10 million tokens a day of mixed input/output for over 5 years to break even. At which point that hardware is outdated compared to the SOTA models, which will probably still be cheap on hosted platforms.
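
For anyone who wants to redo that arithmetic with their own numbers, here is a rough sketch. The function name and the `input_share` parameter are my own assumptions (the comment only says "mixed"), and it ignores electricity and resale value entirely.

```python
def break_even_years(hardware_cost, mtok_per_day, input_price, output_price, input_share):
    """Years of API usage whose total cost equals the hardware price.

    Prices are USD per million tokens. input_share is the assumed
    fraction of tokens that are input (hypothetical knob, not from the
    thread). Electricity and resale value are ignored.
    """
    # Blended price per million tokens for the assumed input/output mix.
    blended_price = input_share * input_price + (1 - input_share) * output_price
    # What the same daily volume would cost on the hosted API.
    daily_api_cost = mtok_per_day * blended_price
    return hardware_cost / daily_api_cost / 365


if __name__ == "__main__":
    # $3,699 machine, 10 million tokens a day, $0.14 in / $0.28 out.
    for share in (0.5, 0.75):
        years = break_even_years(3699, 10, 0.14, 0.28, share)
        print(f"input share {share:.0%}: {years:.1f} years")
```

With these inputs a 50/50 split comes out at roughly 4.8 years and a 75% input mix at roughly 5.8 years, so "over 5 years" holds for input-heavy workloads, which is probably the typical case for coding assistants.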

The models are good enough now, so I'm waiting for the day they start selling inference ASICs with 100x the token output speed. See Taalas demo.
Taalas is a nice concept, but I don’t want to use the same model forever!
Just buy a new one every few years, just like your phone and laptop. And sell the old one.
At the current rate of change, the old one will sell for almost nothing.
I just use my gaming PC, so I can play games or code with assistance for fun. It's awesome because it's mine and technically I can do whatever I want with it. Having one decent computer around plus lower-end laptops is pretty budget friendly.
The 14-inch MacBook Pros with 64GB are really good value considering it's a much more complicated machine than the Mini.
On M5 Pro that's still ~3k