by lumost 183 days ago
A big challenge is that an LLM cannot selectively sample its training set. You don't forget what a coffee cup looks like just because you only drank water for a week. LLMs, on the other hand, will catastrophically forget material from their training set when the batches stop containing a roughly uniform mix of samples from across that set.
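To make the point about batch composition concrete, here is a rough sketch of what "uniform" versus "not uniform" looks like at the batch level. It does no training at all; it only counts which topics land in each batch. The toy corpus, the topic labels, and the helper names `make_batches` and `topic_counts` are all made up for illustration.

```python
import random
from collections import Counter

def make_batches(samples, batch_size, shuffle, seed=0):
    """Split samples into fixed-size batches, optionally shuffling first."""
    items = list(samples)
    if shuffle:
        random.Random(seed).shuffle(items)
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def topic_counts(batches):
    """Count how many samples of each topic land in each batch."""
    return [Counter(topic for topic, _ in batch) for batch in batches]

# Hypothetical toy corpus: 8 "coffee" samples followed by 8 "water" samples.
corpus = [("coffee", i) for i in range(8)] + [("water", i) for i in range(8)]

# Sequential order: every late batch is pure "water", so nothing in the
# later updates refers back to "coffee".
sequential = topic_counts(make_batches(corpus, batch_size=4, shuffle=False))

# Shuffled order: topics are spread across batches, which is closer to the
# uniform mix described above.
shuffled = topic_counts(make_batches(corpus, batch_size=4, shuffle=True))

for name, counts in (("sequential", sequential), ("shuffled", shuffled)):
    print(name, [dict(c) for c in counts])
```

In the sequential case the last two batches contain no "coffee" samples at all, which is the "only drank water for a week" situation. Shuffling only helps when all the data is available up front; it says nothing about how to handle data that arrives later.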