| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by zozbot234 124 days ago
	> We still need good value hardware to run Kimi/GLM in-house If you stream weights in from SSD storage and freely use swap to extend your KV cache it will be really slow (multiple seconds per token!) but run on basically anything. And that's still really good for stuff that can be computed overnight, perhaps even by batching many requests simultaneously. It gets progressively better as you add more compute, of course.

2 comments

Aurornis 124 days ago

> it will be really slow (multiple seconds per token!)

This is fun for proving that it can be done, but that's 100X slower than hosted models and 1000X slower than GPT-Codex-Spark.

That's like going from real time conversation to e-mailing someone who only checks their inbox twice a day if you're lucky.

link

zozbot234 124 days ago

You'd need real rack-scale/datacenter infrastructure to properly match the hosted models that are keeping everything in fast VRAM at all times, and then you only get reasonable utilization on that by serving requests from many users. The ~100X slower tier is totally okay for experimentation and non-conversational use cases (including some that are more agentic-like!), and you'd reach ~10X (quite usable for conversation) by running something like a good homelab.

link

HPsquared 124 days ago

At a certain point the energy starts to cost more than renting some GPUs.

link

vardalab 124 days ago

Yeah, that is hard to argue with because I just go to OpenRouter and play around with a lot of models before I decide which ones I like. But there's something special about running it locally in your basement

link

dotancohen 124 days ago

I'd love to hear more about this. How do you decide that you like a model? For which use cases?

link

fc417fc802 124 days ago

Aren't decent GPU boxes in excess of $5 per hour? At $0.20 per kWhr (which is on the high side in the US) running a 1 kW workstation 24/7 would work out to the same price as 1 hour of GPU time.

The issue you'll actually run into is that most residential housing isn't wired for more than ~2kW per room.

link