Hacker News
by meiraleal 590 days ago
If that's the case, why aren't we yet seeing specialized LLMs for, say, only JavaScript, or for translating from English to Portuguese, etc.?
2 comments

We are likely going to get there. As with steam and combustion engines (and other core technologies such as computers and wireless transmission), there is first a massive rush to increase raw power, at the cost of efficiency and effectiveness for more niche use cases. Then the technology is specialised for various use cases, with large improvements in efficiency and effectiveness. My own prediction for where most gains will now come from is:

1) Creating new "harnesses" for models that connect them to various systems, APIs, frameworks, etc. While this sounds "trivial", a lot of gains can come from it. For example, the voice version of ChatGPT was (apparently) amazing, yet all it really required was an additional voice-to-text layer and a text-to-voice layer.

2) Increasing specialisation of models. I predict that, over time, end-user AI companies (i.e. those that just use models rather than develop them) will use more and more specialised models. The current, almost monolithic, system where every service from text summarisation to homework help is plugged into the same model will slowly change.

We kind of have; that's what fine-tuning is trying to achieve.

We haven't seen wholesale specialised models yet because creating foundation models is expensive and difficult, and the highest ROI currently comes from making a general model.