by delichon 387 days ago
> Local models are a thing. You don't need proprietary models and API calls at all for certain uses. And these models get better and better each year.

They are getting better so fast that I'm considering building a business that depends on much lower-cost LLM inference. That means betting years of effort on it.

But the bet is also that the proprietary models won't pull ahead so fast that local models become uncompetitive even as they improve. Can the local models keep up? They seem to be closing the gap now. Is that the rule or an artifact of the early development phase?

The safer plan may be to pass the inference cost through to the user and let them pick premium or budget models according to their need almost per request, as Zed editor does now.

3 comments

Outside of giant tech companies, there are many researchers with access to little more than a single consumer GPU card. They are highly motivated to reduce the cost of training and inference.
> The safer plan may be to pass the inference cost through to the user and let them pick premium or budget models according to their need almost per request, as Zed editor does now.

I'm working on a solution right now that uses a local/cheap model first, does some validation, and, if that validation fails, falls back to the expensive SOTA model. This is the most reasonable approach if you have a way to verify the results somehow (which might not be easy depending on the use case).
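
For illustration only, here is a minimal Python sketch of that cheap-first, validate, then fall back pattern. It is not the commenter's actual code. The names `cascade`, `is_triple_list`, `cheap_stub`, and `expensive_stub` are hypothetical, the two "models" are stand-in functions rather than real inference calls, and the validator assumes the task expects a JSON list of `[subject, predicate, object]` string triples.

```python
import json
from typing import Callable, Tuple

Model = Callable[[str], str]            # prompt -> raw model output
Validator = Callable[[str, str], bool]  # (prompt, output) -> acceptable?


def is_triple_list(prompt: str, output: str) -> bool:
    """Hypothetical validator: accept only a non-empty JSON list of
    3-element lists whose items are all non-empty strings."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, list) or not data:
        return False
    return all(
        isinstance(t, list)
        and len(t) == 3
        and all(isinstance(x, str) and x for x in t)
        for t in data
    )


def cascade(prompt: str, cheap: Model, expensive: Model,
            validate: Validator) -> Tuple[str, str]:
    """Try the cheap model first; call the expensive one only if the
    cheap output fails validation. Returns (output, tier_used)."""
    draft = cheap(prompt)
    if validate(prompt, draft):
        return draft, "cheap"
    return expensive(prompt), "expensive"


# Stand-ins for real inference calls (assumptions, not real APIs).
def cheap_stub(prompt: str) -> str:
    return "Sure! Here are the triples you asked for."  # not valid JSON


def expensive_stub(prompt: str) -> str:
    return '[["Zed", "is_a", "editor"]]'


if __name__ == "__main__":
    output, tier = cascade("Extract triples: Zed is an editor.",
                           cheap_stub, expensive_stub, is_triple_list)
    print(tier, output)
```

A structural check like this only catches malformed output. Whether the extracted triples are actually correct is the harder part of verification, and that depends on the use case.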

It might not matter that proprietary models stay ahead of local, as long as the local models are strong enough for your use case.
The use case is structuring arbitrary natural language, e.g. triple extraction. That seems to benefit from as much context and intelligence as can be applied. "Good enough" remains a case-by-case judgment.