Hacker News
by AStonesThrow 611 days ago
LLMs don't necessarily need to reproduce their source material to make use of it. They could summarize, analyze, condense, paraphrase, or extract statistics and factoids. There's also the question of whether, and how, the models actually store the source material. The weights are typically far smaller than the training corpus, so it's physically impossible for all of the verbatim text to live in them; at the very least, it's compressed or abstracted. So any copyright claims will need to get beyond a simplistic allegation of copying, for sure.