Hacker News new | ask | show | jobs
by CuriouslyC 707 days ago
You can beat gpt4/claude in terms of price/performance for most things by a mile using fine tuned models running in a colo. Those extra parameters give the chatbots the ability to understand malformed input and to provide off the cuff answers about almost anything, but small models can be just as smart about limited domains.
1 comments

The problem is that once you say “fine tuned” then you have immediately slashed the user base down to virtually nothing. You need to fine-tune per-task and usually per-user (or org). There is no good way to scale that.

Apple can fine-tune a local LLM to respond to a catalog of common interactions and requests but it’s hard to see anyone else deploying fine-tuned models for non-technical audiences or even for their own purposes when most of their needs are one-off and not recurring cases of the same thing.

Not necessarily, you can fine tune on a general domain of knowledge (people already do this and open source the results) then use on device RAG to give it specific knowledge in the domain.