| HN Mirror


	by zozbot234 62 days ago
	Mind you, a 30B model (3B active) is not going to be comparable to Opus. There are open models that are near-SOTA but they are ~750B-1T total params. That's going to require substantial infrastructure if you want to use them agentically, scaled up even further if you expect quick real-time response for at least some fraction of that work. (Your only hope of getting reasonable utilization out of local hardware in single-user or few-users scenarios is to always have something useful cranking in the background during downtime.)
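To put rough numbers on the size gap described above, here is a minimal sketch of the weight-memory arithmetic. The bit widths (16, 8, 4) are assumed quantization levels, not something stated in the thread, and the estimate covers weights only: it ignores KV cache, activations, and runtime overhead. For a mixture-of-experts model, the total parameter count (not the active count) is what has to be stored.

```python
def weight_memory_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = total_params * bits_per_weight / 8
    return bytes_total / 1e9


# Parameter counts taken from the comment; bit widths are assumptions.
MODEL_SIZES = [("30B", 30e9), ("750B", 750e9), ("1T", 1e12)]
BIT_WIDTHS = (16, 8, 4)

for name, params in MODEL_SIZES:
    for bits in BIT_WIDTHS:
        gb = weight_memory_gb(params, bits)
        print(f"{name} total params at {bits}-bit: {gb:,.0f} GB")
```

Under these assumptions, a 1T-parameter model at 4 bits per weight needs about 500 GB for weights alone, versus about 15 GB for a 30B model.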

3 comments

pitched 62 days ago

For a business with ten or more engineers/people-using-ai, it might still make sense to set this up. For an individual though, I can’t imagine you’d make it through to positive ROI before the hardware ages out.
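The ROI argument above comes down to a break-even calculation, sketched below. Every number in the example call is a hypothetical placeholder (hardware price, hosted-API spend per user, running costs); none of them come from the thread, so substitute your own figures.

```python
from typing import Optional


def break_even_months(
    hardware_cost: float,
    users: int,
    hosted_cost_per_user_month: float,
    running_cost_month: float,
) -> Optional[float]:
    """Months until local hardware pays for itself, or None if it never does.

    The monthly saving is the hosted-API spend avoided, minus the cost of
    running the local setup (power, maintenance, and so on).
    """
    monthly_saving = users * hosted_cost_per_user_month - running_cost_month
    if monthly_saving <= 0:
        return None  # local setup costs more per month than it saves
    return hardware_cost / monthly_saving


# Hypothetical placeholder inputs, for illustration only.
for team_size in (1, 10, 30):
    months = break_even_months(
        hardware_cost=100_000,
        users=team_size,
        hosted_cost_per_user_month=200,
        running_cost_month=500,
    )
    label = "never" if months is None else f"{months:.1f} months"
    print(f"{team_size} users: break-even in {label}")
```

The shape of the result matters more than the placeholder values: break-even time shrinks roughly in proportion to the number of users sharing the hardware, which is why a team can justify a setup that an individual cannot.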

zozbot234 62 days ago

It's hard to tell for sure because the local inference engines/frameworks we have today are not really that capable. We have barely started exploring the implications of SSD offload, saving KV-caches to storage for reuse, setting up distributed inference in multi-GPU setups or over the network, making use of specialty hardware such as NPUs etc. All of these can reuse fairly ordinary, run-of-the-mill hardware.
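As one illustration of the "saving KV-caches to storage for reuse" idea, here is a toy sketch of a disk-backed prefix cache. It is not how any particular inference engine does it: `PrefixCache` is a hypothetical helper, and the cached `state` is an opaque stand-in for real key/value tensors. The sketch only shows the lookup pattern, which is to find the longest already-processed token prefix on disk so that only the remaining tokens need to be recomputed.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Any, Optional, Sequence, Tuple


class PrefixCache:
    """Toy on-disk cache keyed by a hash of the token prefix (hypothetical)."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, tokens: Sequence[int]) -> Path:
        digest = hashlib.sha256(repr(tuple(tokens)).encode()).hexdigest()
        return self.root / f"{digest}.pkl"

    def save(self, tokens: Sequence[int], state: Any) -> None:
        """Persist the state computed for exactly this token prefix."""
        self._path(tokens).write_bytes(pickle.dumps(state))

    def load_longest_prefix(
        self, tokens: Sequence[int]
    ) -> Tuple[int, Optional[Any]]:
        """Return (prefix_length, state) for the longest cached prefix."""
        for n in range(len(tokens), 0, -1):
            path = self._path(tokens[:n])
            if path.exists():
                return n, pickle.loads(path.read_bytes())
        return 0, None  # nothing cached; everything must be recomputed


# Usage: cache a shared prompt prefix, then reuse it for a longer prompt.
with tempfile.TemporaryDirectory() as tmp:
    cache = PrefixCache(tmp)
    cache.save([1, 2, 3], state={"stand_in_for": "KV tensors"})
    reused, state = cache.load_longest_prefix([1, 2, 3, 4, 5])
    print(f"reused {reused} of 5 tokens from disk")
```

A real implementation would store tensors rather than pickled Python objects and would probably key on fixed-size token blocks instead of probing every prefix length, but the trade-off is the same: storage reads in exchange for skipped prefill compute.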

DeathArrow 62 days ago

Since you need at least a few H100-class GPUs, I guess you need at least a few tens of coders to justify the costs.

pitched 62 days ago

I see the 512GB Mac Studios aren’t for sale anymore, but that was a much cheaper path.

cyberax 62 days ago

I'm backing up a big dataset onto tapes, so I wanted to automate it. I have an idle setup with 64GB of VRAM in my basement, so I decided to experiment and tasked it with writing an LTFS implementation. LTFS is an open standard for tape filesystems, and there's an implementation in C that can be used as the baseline.

So far, within the last 2 days, Qwen 3.6 has created a functionally equivalent Golang implementation that works against the flat file backend. I'm extremely impressed.

Gareth321 62 days ago

It is surprisingly competent. It's not Opus 4.6, but it works well for well-structured tasks.

wuschel 62 days ago

What near-SOTA open models are you referring to?
