| HN Mirror

by dvt 3 days ago

> Even at non-VC-subsidized $/token prices, it's still much cheaper to run cloud-based models.

On a price-per-watt basis, this is not true; people have done the math on /r/LocalLLaMA many times over[1]. Local models, while not as good as premier models (GPT 5.5, etc.), are ~80%+ of the way there, and often converge on a similar solution after a few dead ends.

[1] https://www.reddit.com/r/LocalLLM/comments/1kshq4f/electrici...

fwip 3 days ago

Maybe not per watt, but unless you already happen to own the 3090 cited by that post, you'd have to buy one as well, and they are currently selling for around $1400 used.
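
For anyone who wants to redo that math with their own numbers, here is a rough Python sketch of the two costs being argued about: electricity per million tokens, and how many tokens it takes before the card pays for itself. The function names are made up for illustration, and every input except the $1400 used price mentioned above is a placeholder assumption, not a measurement.

```python
def electricity_cost_per_mtok(watts, tokens_per_sec, usd_per_kwh):
    """USD of electricity needed to generate one million tokens locally."""
    seconds = 1_000_000 / tokens_per_sec
    kwh = watts * seconds / 3_600_000  # watt-seconds -> kilowatt-hours
    return kwh * usd_per_kwh


def breakeven_mtok(hardware_usd, cloud_usd_per_mtok, local_usd_per_mtok):
    """Millions of tokens before the hardware purchase is recovered.

    Returns None when local generation is not cheaper per token,
    since the hardware then never pays for itself.
    """
    saving = cloud_usd_per_mtok - local_usd_per_mtok
    if saving <= 0:
        return None
    return hardware_usd / saving


if __name__ == "__main__":
    # ASSUMED placeholders: power draw, throughput, electricity price,
    # and cloud price are illustrative only. Swap in your own figures.
    local = electricity_cost_per_mtok(watts=350, tokens_per_sec=30, usd_per_kwh=0.15)
    cloud = 1.00  # hypothetical cloud price in USD per million tokens
    print(f"local electricity: ${local:.2f} per million tokens")
    print(f"break-even on a $1400 card: {breakeven_mtok(1400, cloud, local)} million tokens")
```

This ignores idle power, the rest of the machine, and resale value, so treat the result as a lower bound on the local cost.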

strictnein 3 days ago

3090s are running $1400 now? Wowsers. I thought I was overspending when I bought 6x of them for around $800 a pop.

Might be time to sell, to be honest. It's fun to have that at home, but I can't justify having $10k (with memory, mobo, cpu, etc) sitting in my basement without being fully utilized.

karim79 3 days ago

I'll take two of them. A thousand apiece.

dvt 3 days ago

I do have a 3090 Ti in my gaming PC, but even my old M1 MBP (with a mere 32 GB of RAM) is quite competent and can run a quantized `Gemma4-26B-A4B` in the background while I do other stuff.

ActorNightly 2 days ago

The MBP running Gemma4 is absolutely useless for any real work.

nozzlegear 2 days ago

What is "real work"?

ActorNightly 2 days ago

Where you are developing software. It's significantly faster to use Google Gemini and copy-paste code back and forth than to have Gemini edit files for you.

ClikeX 2 days ago

To be fair, I can also use that 3090 for other things locally, not just AI.
