| HN Mirror


	by embedding-shape 107 days ago
	But why would I want the results to be faster but less reliable, rather than slower and more reliable? Feels like the sort of thing where you'd favor accuracy over speed, otherwise you're just degrading the quality control?

3 comments

bigyabai 107 days ago

The high-nines of fruit organization are usually not worth running a 400 billion parameter model to catch the last 3 fruit.

CamouflagedKiwi 106 days ago

It's not that you want it to be faster, but you want the latency to be predictable and reliable, which is much more the case for local inference than sending it away over a network (and especially to the current set of frontier model providers who don't exactly have standout reliability numbers).

embedding-shape 106 days ago

> which is much more the case for local inference than sending it away over a network

Of course, but that isn't what's unclear here.

What's unclear is why a 7b LLM would be better for those things than, say, a 14b model, as the latency difference will be minuscule, yet the parent somehow claimed that smaller models make more sense for verification because latency is more important than accuracy.

0cf8612b2e1e 107 days ago

Local, offline system you control is worth a lot. Introducing an external dependency guarantees you will have downtime outside of your control.

embedding-shape 106 days ago

Right, but that doesn't answer why you'd need a fast 7b LLM rather than a slightly less fast 14b LLM.

0cf8612b2e1e 106 days ago

In the hypothetical fruit sorting example, if you have a hard budget of 10 msec to respond and the 7B takes 8 msec and the 14B takes 12 msec, there is your imaginary answer. It's regular engineering, where you have to balance competing constraints instead of running the biggest model available.
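
A minimal sketch of that trade-off, assuming you know each model's worst-case latency: keep only the models that fit the budget, then take the most accurate of those. The 8/12/10 msec figures come from the example above; the accuracy values and the names `Candidate` and `pick_model` are made up purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    latency_ms: float  # worst-case response time (from the example above)
    accuracy: float    # hypothetical value, not a measured result


def pick_model(candidates: Sequence[Candidate], budget_ms: float) -> Optional[Candidate]:
    """Return the most accurate candidate that fits the latency budget, or None."""
    feasible = [c for c in candidates if c.latency_ms <= budget_ms]
    return max(feasible, key=lambda c: c.accuracy, default=None)


candidates = [
    Candidate("7B", latency_ms=8.0, accuracy=0.90),    # accuracy is an assumption
    Candidate("14B", latency_ms=12.0, accuracy=0.93),  # accuracy is an assumption
]

chosen = pick_model(candidates, budget_ms=10.0)
print(chosen.name if chosen else "no model fits the budget")
```

With a 10 msec budget the 14B is filtered out before accuracy is even compared; loosen the budget to 20 msec and the same function picks the 14B.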

jwatte 106 days ago

Hard real time is a thing in some systems. Also, the current approaches might have 85% accuracy -- if the LLM can deliver 90% accuracy while being "less exact" that's still a win!

IanCal 106 days ago

Can you fit the 14B on the device they're using? That feels rather important.

And then it depends on whether there is a useful difference in performance between the two.
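
A rough way to check the first question, as a sketch only: weight memory is approximately parameter count times bytes per parameter. The 16 GB device, the 8-bit quantization, and the 1.2x overhead factor for activations and cache are all assumptions chosen for illustration, not details from this thread.

```python
def weight_bytes(n_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone."""
    return n_params * bits_per_param / 8


def fits_on_device(n_params: float, bits_per_param: int,
                   device_bytes: float, overhead: float = 1.2) -> bool:
    """Crude fit check; `overhead` is an assumed allowance for activations and cache."""
    return weight_bytes(n_params, bits_per_param) * overhead <= device_bytes


DEVICE_BYTES = 16e9  # hypothetical 16 GB device

for name, n_params in [("7B", 7e9), ("14B", 14e9)]:
    ok = fits_on_device(n_params, bits_per_param=8, device_bytes=DEVICE_BYTES)
    print(f"{name} at 8-bit: {'fits' if ok else 'does not fit'}")
```

Under these assumed numbers the 7B fits and the 14B does not, which would settle the question before accuracy enters into it.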

0xbadcafebee 106 days ago

...because sometimes people need a faster answer? There are many possible reasons someone might need speed over accuracy. In the food sorting example, if lower accuracy means you waste more peanuts, but the speed means you get rid of more bad peanuts overall, then you get fewer complaints about bad peanuts, at the cost of a tiny amount of extra material waste.
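
To make the peanut arithmetic concrete, here is a toy model under stated assumptions: peanuts arrive at a fixed rate, the sorter can inspect at most one peanut per inference, and anything it has no time to inspect ships unchecked. Every number below (arrival rate, bad fraction, recall, false-reject rate) is invented for illustration; only the shape of the trade-off comes from the comment above.

```python
def sorter_outcomes(arrival_rate: float, latency_ms: float, bad_fraction: float,
                    recall: float, false_reject_rate: float) -> tuple[float, float]:
    """Return (bad peanuts shipped per second, good peanuts wasted per second)."""
    capacity = 1000.0 / latency_ms            # inspections per second
    inspected = min(arrival_rate, capacity)   # the rest pass through unchecked
    bad_removed = inspected * bad_fraction * recall
    bad_shipped = arrival_rate * bad_fraction - bad_removed
    good_wasted = inspected * (1.0 - bad_fraction) * false_reject_rate
    return bad_shipped, good_wasted


# Hypothetical line: 100 peanuts/s, 5% of them bad.
fast = sorter_outcomes(100, latency_ms=8, bad_fraction=0.05,
                       recall=0.90, false_reject_rate=0.02)
slow = sorter_outcomes(100, latency_ms=12, bad_fraction=0.05,
                       recall=0.97, false_reject_rate=0.01)

print(f"fast: {fast[0]:.2f} bad shipped/s, {fast[1]:.2f} good wasted/s")
print(f"slow: {slow[0]:.2f} bad shipped/s, {slow[1]:.2f} good wasted/s")
```

In this made-up setup the slower, more accurate sorter cannot keep up with the line, so it ships more bad peanuts overall even though it is better on each peanut it does inspect, while the faster one wastes a bit more good material.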
