by WhiteNoiz3 55 days ago
IIRC, the model is still pre-trained on modern text before being fine-tuned on 1930s material, so it's possible it still has some knowledge of words that didn't exist back then. Edit: looks like they make some attempt to filter modern documents out of the pre-training data, but it's still possible some sneak in.
1 comment

There could be a leak from post-training, but not from pre-training.