| HN Mirror


	by WhiteNoiz3 55 days ago
	IIRC, the model is still pre-trained on modern text before being fine-tuned on 1930s material, so it's possible it still has some knowledge of words that didn't exist back then. Edit: looks like they make some attempt to filter modern documents out of the pre-training data, but it's still possible some sneak in.

1 comment

There could be a leak from post-training, but not from pre-training.