Hacker News
by vmh1928 1079 days ago
Isn't part of the problem that the model retains some of its training data and draws on it during response generation? In that case, it's not just that the copyrighted book was used as training data; some part of the book is still stored in the model. So now my model is using copyrighted material every time it runs. Here's an example of a model that retained enough image data to reconstruct a reasonable facsimile of a training image:

https://www.theregister.com/2023/02/06/uh_oh_attackers_can_e...