| HN Mirror


by abtinf 20 days ago

Insofar as I can tell, inference is on a certain path toward becoming "free". The models are now extremely powerful on high-end consumer hardware, and the efficiency trend seems likely to continue.

Here is a recent non-rigorous benchmark I ran against a bunch of models. Qwen3.6 35B A3B fine-tuned with Opus data runs plenty fast on my local machine and produces outstanding results - easily in the top 5, comparable to GPT 5.5 Pro (which is $180/mtok).

https://gistpreview.github.io/?31d66ef69e4aed3efae1aec69d86c...

I've predicted for years now that the industry will head down the path of the virus scanning vendors: selling subscriptions to be able to download the latest versions of models. I simply don't see how any other business model is remotely viable, except at the very highest end of inference or video gen.

1 comment

anonzzzies 20 days ago

That local hardware is prosumer, though, not consumer. Consumer would be a $500 laptop running that model, and that is not currently the case.
