Hacker News
by
sitkack
517 days ago
All of the most capable models I use have been clearly trained on the entirety of libgen/z-lib. You know it is the first thing they did, it is like 100TB.
Some of the models are even coy about it.
1 comments
zaptrem
516 days ago
The models are not self-aware about their training data. They are only aware of what the internet has said about previous models’ training data.
sitkack
516 days ago
I am not straight up asking them. We know the pithy statement about that word.