by mrbabbage 1094 days ago
I totally agree that it's not obvious that an ML model is a derivative work. the Copyright Act describes derivative works as forms in which a work is "recast, transformed, or adapted", and a pile of model weights isn't clearly that, IMO. I think it's fair to say that inference outputs that directly replicate the creative and expressive elements of a copyrighted work infringe (only those elements, because factual information isn't copyrightable!). but I don't think it's obvious that the model itself does.

> If one goes to their local library and scans all the books there to generate the models used to OCR text, does that make the OCR model and application derivative works of the books?

there is a court case [1] addressing an even more infringing use case: scanning and OCR'ing books to produce a searchable database. that case turned on fair use, however, not on whether the database was a derivative work.

[1] https://en.wikipedia.org/wiki/Authors_Guild,_Inc._v._Google,....