by xp84 1069 days ago
Ok. I guess we just disagree then. In my view, that model doesn’t “contain” the works. It contains lists of numbers (and not in the ASCII sense that a silly rebuttal would make; I mean only the tokens and weights), not “pieces of” the books. If I published a statistical analysis of word frequency in your books, I don’t think you’d have a slam-dunk copyright infringement case against me, even if someone could use that analysis to generate some passages of your book. The model certainly can’t generate the whole book, as we can plainly see (otherwise OpenAI has actually invented magic compression). Similarly, if you sold consulting services to budding fantasy authors to help them write better, and employed people who had read those books many times, those consultants would not themselves be derivative works just because they learned the material. The derivative work would be those people’s output (if it rips off that material).