by mikeagb 993 days ago
I agree that stating outright that the answer is no is too strong. The general consensus has definitely been that fine-tuning (especially instruction fine-tuning) primarily picks up style rather than facts, but that doesn't mean it's not doable. Continuous pre-training is used to instill new knowledge, and the line where it becomes "fine-tuning" rather than "continuous pre-training" is not obvious to me.

This line between fine-tuning and continuous pre-training is what I'm interested in. What is the investment difference between fine-tuning, continuous pre-training, and training from scratch? Do you (or anyone else) have any good sources, or know of good examples where continuous pre-training is being done?