HN Mirror



	by sitkack 517 days ago
	All of the most capable models I use have clearly been trained on the entirety of libgen/z-lib. You know it is the first thing they did; it is something like 100TB. Some of the models are even coy about it.

1 comment

The models are not self-aware about their training data. They only know what the internet has said about previous models’ training data.

I am not straight up asking them. We know the pithy statement about that word.