by lolinder 534 days ago
Heck, I'm willing to pay $3000 for one of these to get a good model that runs my requests locally. It's probably just my stupid ape brain trying to do finance, but I'm infinitely more likely to run dumb experiments with LLMs on hardware I own than I am while paying per token (to the point where I currently spend way more time with small local llamas than with Claude), and even though I don't do anything sensitive I'm still leery of shipping all my data to one of these companies.

This isn't competing with cloud, it's competing with Mac Minis and beefy GPUs. And $3000 is a very attractive price point in that market.


Have you been to the LocalLLaMA subreddit? It’s a great resource for running models locally. It’s what got me started.

https://www.reddit.com/r/LocalLLaMA/

Yep! I don't spend much time there because I got pretty comfortable with llama before that subreddit really got started, but it's definitely turned up some helpful answers about parameter tuning from time to time!
I'm pretty frugal, but my first thought is to get two to run 405B models. Building out 128GB of VRAM isn't easy, and will likely cost twice this.
You can get an M4 Max MBP with 128GB for $1k less than two of these single-use devices.
These are 128GB each. Also, Nvidia's inference speed is much higher than Apple's.

I do appreciate that my MBP can run models though!

I read that the Nvidia units are 250 TFLOPS vs. 27 TFLOPS for the M4 Pro. If they perform as advertised, I'm in for two.
Don't these devices provide 128GB each? So you'd need to price in two Macs to be a fair comparison to two Digits.
But then you have to use macOS.
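
For anyone wanting to sanity-check the "two units for 405B" idea from upthread, here's a rough back-of-the-envelope sketch. It only counts weight storage (parameters × bits per weight ÷ 8) and ignores KV cache, activations, and runtime overhead, so real requirements will be higher. The helper name `weights_gb` and the choice of 16/8/4-bit precisions are my own assumptions for illustration, and it assumes the two units' memory can actually be pooled.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10^9 bytes).

    Assumption: counts weights only, no KV cache or runtime overhead.
    """
    return params_billion * bits_per_weight / 8


# Two units at 128GB each, as stated in the thread.
TOTAL_MEMORY_GB = 2 * 128

for bits in (16, 8, 4):
    need = weights_gb(405, bits)
    verdict = "fits" if need <= TOTAL_MEMORY_GB else "does not fit"
    print(f"{bits}-bit: {need:.1f} GB of weights, {verdict} in {TOTAL_MEMORY_GB} GB")
```

By this estimate, only the 4-bit case (about 202.5 GB of weights) squeezes into 256GB, and that's before leaving room for context.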