Hacker News new | ask | show | jobs
by spenczar5 1135 days ago
How feasible and effective would it be to fine-tune a model against an organization's private source code, resulting in an "internal" model that knows how to work with that org's stuff?

Could you, say, fine-tune the model every week with the latest merges? Every hour?

2 comments

Finetuning is a relatively quick process. Training the base model is the expensive part (can take weeks and huge amounts of compute), whereas finetuning usually is only on the last few layers and can be done with much less resources. You could definitely have a "nightly" finetune model that is retrained every day or so.
Interesting - how would that work for a company that wanted to run their own codex model, on-prem, trained on their own code? Perhaps also trained on their dependencies?
Finetuning a smaller model leading to better performance seems like a significant finding that'll lead to a lot of companies fine-tuning their own internal "ChatGPT"s