by viraptor
770 days ago
Does it matter? What's the legal view on "I downloaded some data which turns out to be copied from a copyrighted source and it was probably trivial to figure it out, then trained the LLM on it"? I mean, they work on data processing - of course they would expect that if someone responds with 10 paragraphs in reporting style, under a link to NYT... that's just the article.

If Stack Overflow collects a bunch of questions and comments and exposes them as a big dataset licensed as Creative Commons, but it actually contains quite a bit of copyrighted content, whose responsibility is it to check that data for copyright violations? If I use something licensed as CC in good faith and it turns out the provider or seller of that content had no right to relicense it, am I culpable? Is this just a new lawsuit where I can seek damages for the lawsuit I just lost?