by pjc50
314 days ago
Which is an interesting view when applied to IP. I think it's relatively uncontroversial that an MP4 file which "predicts" a Disney movie which it was "trained on" is a derived work. Suppose you have an LLM which was trained on a fairly small set of movies and could produce any one of them on demand; would that be treated as a derived work? If you have a predictor/compressor LLM which was trained on all the movies in the world, would that not also be infringement?
|
An LLM is (or can be used as) a compression algorithm, but it is not compressed data. It is possible for an overfit model to exactly predict (or reproduce) a particular output, but by the pigeonhole principle it’s not possible for one to reproduce every output.
To reiterate: LLMs are not compressed data.
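For what it's worth, the pigeonhole point can be sketched in a few lines of Python. This is only a toy illustration, not a claim about any real model: `has_collision` and `truncate_to_2_bits` are made-up names, and the "compressor" here is a hypothetical stand-in that keeps the first 2 bits of a 3-bit input. With 8 possible inputs and only 4 possible codes, at least two inputs must share a code, so no decoder can get all of them back.

```python
from itertools import product


def has_collision(encode, input_bits):
    """Return True if two distinct bit strings of length input_bits map to the same code."""
    seen = set()
    for bits in product("01", repeat=input_bits):
        code = encode("".join(bits))
        if code in seen:
            # Two different inputs share this code, so decoding cannot be lossless.
            return True
        seen.add(code)
    return False


def truncate_to_2_bits(s):
    # Hypothetical "compressor": keeps only the first 2 bits of the input.
    return s[:2]


# 8 possible 3-bit inputs, but only 4 possible 2-bit codes.
print(has_collision(truncate_to_2_bits, 3))  # True
# The identity map uses as many bits as the input, so nothing collides.
print(has_collision(lambda s: s, 3))  # False
```

The same counting applies to any scheme that maps a larger set of inputs onto a smaller set of codes; the truncation is just the easiest one to write down.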