Hacker News
by Decabytes 633 days ago
Since most people can’t run these LLMs locally, I wonder what it would look like to have hyper-tuned models for specific purposes, i.e. a model for code, a model for prose, etc. A director model interprets which downstream model should be used and then runs it. That way you can run everything locally, without needing beefy GPUs. It’s a trade-off: you use more disk space in exchange for needing less VRAM.
4 comments

The whole point of this model is that it's so tiny that even a weak RPi could run it. Apple has also done some interesting work with a common <4B base model that is customized with different LoRAs for different purposes.
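For anyone unfamiliar with the shared-base-plus-LoRA setup, here is a toy sketch of the general idea, and not Apple's actual implementation. The 2x2 base matrix, the adapter values, and the task names are invented for illustration, and the usual LoRA scaling factor is left out. Each adapter is a pair of small matrices B and A, and the effective weight is W + B·A.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Shared base weight, kept frozen and loaded once (made-up values).
BASE_W = [[1.0, 0.0],
          [0.0, 1.0]]

# Hypothetical rank-1 adapters per task: B is 2x1, A is 1x2.
ADAPTERS = {
    "code":  ([[1.0], [0.0]], [[0.0, 2.0]]),
    "prose": ([[0.0], [1.0]], [[3.0, 0.0]]),
}

def effective_weight(task):
    """Return W + B @ A for the chosen task without modifying the base."""
    b, a = ADAPTERS[task]
    return add(BASE_W, matmul(b, a))

if __name__ == "__main__":
    print(effective_weight("code"))   # [[1.0, 2.0], [0.0, 1.0]]
    print(effective_weight("prose"))  # [[1.0, 0.0], [3.0, 1.0]]
```

Here each adapter stores 4 numbers against the base's 4, so nothing is saved, but in a real model the base has billions of parameters and the adapters are a tiny fraction of that, which is why swapping them per task is cheap.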
If you're using a JetBrains IDE, the AI-based autocompletions are powered by super tiny LLMs, each trained on a single language. This allows them to run locally and still produce decent results.

For example, the C++ model is really good at writing both OpenGL+GLFW and Raylib.

You're essentially describing Apple Intelligence :-)

https://machinelearning.apple.com/research/introducing-apple... (see Model Adaptation)

A rip-off of LLMs and LoRAs. Wrapping it in a shiny-sounding name for the normies doesn't mean they contributed anything to the space.
They're not hiding anything; they've very clearly described what they've done and how they've done it.

They've branded their specific architecture and integration, which allows me to easily refer to it as an example.

I understand that it's easy to be cynical about Apple's approach to product development, but it seems unwarranted in this case.

>IE a model for code

That's already very much a thing: Codestral, Phind, StarCoder, etc.

Fine-tuning models on whatever you want is quite accessible if you have a good dataset and 100 bucks of budget.