by arbfay 372 days ago

Before the post-ChatGPT boom, we used to talk about "catastrophic forgetting"...

Make sure the new training dataset is "large" by augmenting it with general data (think of it as a sample of the original dataset), use PEFT techniques (freezing weights => less risk), and use regularization (e.g. elastic weight consolidation).
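
To make two of these ideas concrete, here is a rough sketch in plain Python (no ML framework). The function names, the `general_fraction` default of 0.5, and the `strength` parameter are all hypothetical choices for illustration, not a recommendation from any library. The first function mixes general data into the fine-tuning set; the second computes the usual diagonal-Fisher form of the elastic weight consolidation penalty, which you would add to the task loss.

```python
import random


def mix_with_general_data(task_examples, general_examples,
                          general_fraction=0.5, seed=0):
    """Return a shuffled training set where roughly `general_fraction`
    of the examples come from general-purpose data.

    `general_fraction` is an illustrative knob; the right ratio is
    something you would have to tune for your own setup.
    """
    if not 0.0 <= general_fraction < 1.0:
        raise ValueError("general_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Number of general examples needed to reach the requested share,
    # capped by how many general examples are actually available.
    wanted = round(len(task_examples) * general_fraction
                   / (1.0 - general_fraction))
    n_general = min(wanted, len(general_examples))
    mixed = list(task_examples) + rng.sample(list(general_examples), n_general)
    rng.shuffle(mixed)
    return mixed


def ewc_penalty(params, anchor_params, fisher_diag, strength=1.0):
    """Elastic weight consolidation penalty (diagonal approximation):

        (strength / 2) * sum_i F_i * (theta_i - theta_anchor_i) ** 2

    `anchor_params` are the pre-fine-tuning weights and `fisher_diag`
    estimates how important each weight was for the original task.
    All three arguments are flat sequences of floats here for simplicity.
    """
    return 0.5 * strength * sum(
        f * (p - a) ** 2
        for p, a, f in zip(params, anchor_params, fisher_diag)
    )


if __name__ == "__main__":
    task = [f"task-{i}" for i in range(10)]
    general = [f"general-{i}" for i in range(100)]
    print(len(mix_with_general_data(task, general)))
    print(ewc_penalty([1.0, 2.0], [1.0, 1.0], [3.0, 4.0], strength=2.0))
```

In a real training loop the total loss would be something like `task_loss + ewc_penalty(...)`, with the Fisher values estimated from gradients on the original data. PEFT is not shown here, since in practice it depends on the specific library you use.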

Fine-tuning is fine, but it will be more expensive than you thought and should be led by more experienced ML engineers. You probably don't need to fine-tune models anyway.