by embedding-shape 106 days ago

> which is much more the case for local inference than sending it away over a network

Of course, but that isn't what's unclear here.

What's unclear is why a 7B model would be better for those things than, say, a 14B model, since the latency difference will be minuscule. Yet the parent claimed the smaller models make more sense for verification because latency is somehow more important than accuracy.