Hacker News
by LeifCarrotson 935 days ago
The article describes how the deployed model can regurgitate chunks of copyrighted works - one of the samples literally ends in a copyright notice.
1 comment

If these were copyrighted works, how did they end up in the public comparison dataset?

Sure, some copyrighted works ended up in the Pile by accident. You can download these directly, without the elaborate "poem" trick.