by JoshTriplett
490 days ago
> If the copyright holders win, the model giants will just license. No, they won't. The biggest models want to train on literally every piece of human-written text ever written. You can pay to license small subsets of that at a time. You can't pay to license all of it. And some of it won't be available to license at all, at any price. If the copyright holders win, model trainers will have to pay attention to what they train on, rather than blithely ignoring licenses.

They genuinely don't. There is a LOT of garbage text out there that they don't want. They want to train on every high-quality piece of human-written text they can get their hands on (where the definition of "high quality" is a major piece of the secret sauce that makes some LLMs better than others), but that doesn't mean every piece of human-written text.