| HN Mirror



	by digitaltrees 35 days ago
	I wonder if it really needs to be worse. I am playing with the idea of fine-tuning a model on my exact stack and coding patterns. I suspect I could get better performance by training “taste” into a model rather than breadth.


epicureanideal 35 days ago

I also wonder about JS-only, Python-only, etc. models.

Maybe the future is a selection of local models, each trained on a specific stack?


robrenaud 34 days ago

There is some recent work on modularizing knowledge in LLMs.

https://arxiv.org/html/2605.06663v1

It might be possible to train a big generalist that is a composition of modules, some of which can be dropped dynamically at inference time, depending on the prompt.
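To make the "drop modules depending on the prompt" idea concrete, here is a toy sketch. It is not the method from the linked paper: the module names and the keyword check are hypothetical stand-ins, and a real system would presumably learn the routing rather than match strings.

```python
from typing import Callable, Dict, List

# Hypothetical modules: each one maps a prompt to some output.
Module = Callable[[str], str]

MODULES: Dict[str, Module] = {
    "python": lambda prompt: "python-module output",
    "javascript": lambda prompt: "javascript-module output",
    "general": lambda prompt: "general-module output",
}


def select_modules(prompt: str) -> List[str]:
    """Always keep the general module; keep a language module
    only when the prompt mentions that language by name."""
    lowered = prompt.lower()
    active = ["general"]
    for name in MODULES:
        if name != "general" and name in lowered:
            active.append(name)
    return active


def run(prompt: str) -> List[str]:
    """Run only the selected modules; the others are never called."""
    return [MODULES[name](prompt) for name in select_modules(prompt)]


if __name__ == "__main__":
    print(select_modules("Write a Python script that parses logs"))
```

The point of the sketch is only that unselected modules cost nothing at inference time, which is where the saving would come from.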


digitaltrees 29 days ago

Cool. Thanks for sharing. I am thinking about creating a series of smaller models for specific purposes and then orchestrating them so they mirror the human brain, which is a bunch of subsystems that give multiple opinions about the same stimulus.


shailendra_sis 29 days ago

Interesting direction. I’ve also been thinking about modular / subsystem-based approaches for specialized tasks in small AI systems.


andy_ppp 35 days ago

These models' ability to generalise at coding will likely get worse if you remove high-quality training data like all of Python.


jimbokun 34 days ago

That approach has its advantages, but sometimes I want to generate code that follows the accepted best practices for a language or kind of project I’m not experienced with.


andy_ppp 35 days ago

Fine-tuning these models (at least with PPO or equivalent) requires even more VRAM than inference does, potentially 2-3 times more.
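A rough back-of-the-envelope sketch of where a multiple like that can come from. It assumes weights, gradients, and optimizer state all use the same dtype, and it ignores activations and any extra models a PPO-style setup may keep in memory (reference, value, reward), so treat the numbers as a lower bound. The 7B model size is a hypothetical example.

```python
def memory_ratio(optimizer_states_per_param: int) -> int:
    """Training memory as a multiple of inference memory, per parameter.

    Assumes one dtype for everything and counts only weights,
    gradients, and optimizer state (no activations, no extra models).
    """
    weights = 1    # needed for inference and training alike
    gradients = 1  # one gradient value per trainable parameter
    return weights + gradients + optimizer_states_per_param


def weights_gib(num_params: int, bytes_per_value: int) -> float:
    """Memory for the weights alone, in GiB."""
    return num_params * bytes_per_value / 2**30


if __name__ == "__main__":
    base = weights_gib(7_000_000_000, 2)  # hypothetical 7B model, 16-bit
    optimizers = [("plain SGD", 0), ("SGD with momentum", 1), ("Adam", 2)]
    for label, states in optimizers:
        total = memory_ratio(states) * base
        print(f"{label}: about {total:.0f} GiB vs {base:.0f} GiB for inference")
```

Under these assumptions the multiple is 2 to 3 for SGD-style optimizers and 4 for Adam, since Adam keeps two moment estimates per parameter.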


rusk 34 days ago

You could use PEFT? Operating on only a subset of weights is fairly standard practice nowadays …
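As a sketch of why PEFT helps with memory, take LoRA as one common PEFT method. Strictly speaking it freezes the original weight matrix and trains two small added matrices rather than a subset of the existing weights. The layer size and rank below are hypothetical.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable values when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable values with LoRA: the original matrix is frozen and
    two low-rank factors are trained, B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    d_in = d_out = 4096  # hypothetical projection layer
    rank = 8             # hypothetical LoRA rank
    full = full_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(f"full: {full:,}  lora: {lora:,}  fraction: {lora / full:.4%}")
```

Gradients and optimizer state are only needed for the trainable values, so the training overhead shrinks roughly in proportion, while the frozen weights still have to fit in memory.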


andy_ppp 34 days ago

Yes, I used LoRA and it’s fine, but I’m not convinced the model doesn’t end up more stupid and less general.
