| HN Mirror



	by neutronicus 308 days ago
	I think the question is: can I throw a couple thousand bucks of GPU time at fine-tuning a model to have knowledge of our couple million lines of C++ baked into the weights, instead of needing to fuck around with "Context Engineering"? Like, how feasible is it for a mid-size corporation to use a technique like LoRA, mentioned by GP, to "teach" (say, for example) Kimi K2 about a large C++ codebase, so that individual engineers don't need to learn the black art of "context engineering" and can just ask it questions?


pu_pe 308 days ago

I'm curious about it too. I think there are two bottlenecks: one is that training a relatively large LLM can be resource-intensive (so people go for RAG and other shortcuts), and the other is that making it finetuned to your use cases might make it dumber overall.


koakuma-chan 308 days ago

> making it finetuned to your use cases might make it dumber overall.

LoRA doesn't overwrite weights.


pu_pe 308 days ago

Do you need to overwrite weights to produce the effect I mentioned above?


koakuma-chan 307 days ago

Good point
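
For anyone following the exchange above, here is a toy sketch of why leaving the base weights untouched does not, by itself, keep the model's behavior the same. It is not a real model: the sizes `d` and `r`, the identity matrix standing in for a frozen layer `W`, and the hand-picked adapter values in `A` and `B` are all made up for illustration. The only thing taken from LoRA itself is the structure, where the adapted layer computes with `W + B·A` while `W` stays frozen.

```python
# Toy illustration (all values hypothetical): a LoRA-style adapter leaves
# the frozen matrix W unchanged, yet the layer's output still changes.

def matmul(a, b):
    # Plain nested-list matrix product.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

d, r = 4, 1  # assumed sizes: layer width 4, adapter rank 1

# Frozen base weights (identity here, purely for readable numbers).
W = [[float(i == j) for j in range(d)] for i in range(d)]
W_before = [row[:] for row in W]

# Adapter factors: B is d x r, A is r x d. In real fine-tuning these
# are the only trained parameters; here they are hand-picked.
B = [[0.5] for _ in range(d)]
A = [[1.0, 0.0, -1.0, 2.0]]

x = [1.0, 2.0, 3.0, 4.0]

base_out = matvec(W, x)                 # output of the frozen layer alone
delta = matmul(B, A)                    # d x d update with rank at most r
adapted_out = matvec(add(W, delta), x)  # output with the adapter applied

weights_untouched = (W == W_before)
outputs_changed = (base_out != adapted_out)

print("W untouched:", weights_untouched)
print("base output:   ", base_out)
print("adapted output:", adapted_out)
```

The sketch only shows that the output moves even though `W` is never written to. Whether a given adapter makes a real model "dumber overall" depends on the training data and setup, which this toy example says nothing about.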
