Hacker News
by bhickey 904 days ago
Disclaimer: ibid

> It can't just crawl Wikipedia because Fair Use in Wikipedia doesn't constitute Fair use for a commercial AI model.

Why not? The lawyers I've discussed this with socially think that questions like this are unresolved. There are certainly competing legal theories, but we're in uncharted territory. No one knows what the outcome will be until rulings come down or Congress acts.

I find the NYT's argument a little hokey. Where are the damages? No one is using ChatGPT to read NYT articles, and the residual value of day-old news stories is close to zero.

2 comments

> > It can't just crawl Wikipedia because Fair Use in Wikipedia doesn't constitute Fair use for a commercial AI model.

> Why not?

Because it’s tantamount to lying and deceptive conduct? It’s like asking for a licence to use something non-commercially, getting hold of it, and conveniently deciding 10 minutes later that you’re actually going to become a reseller of all this stuff you have. Or going to the soup kitchen because you don’t want to pay your private chef tonight.

This analogy doesn't work. Fair use is an affirmative defense to copyright infringement claims. Entities that are training models largely claim that their uses are transformative and fall under fair use. Creative Commons, among others, agrees with this position. [0] If they're right, it simply doesn't matter what license a copyright holder is offering.

There are competing legal theories and no one can say how courts are going to rule on these issues. Smart lawyers who work on copyright and AI don't know. Technologists certainly don't know.

[0] https://creativecommons.org/2023/02/17/fair-use-training-gen...

Then there is the argument that the rules around fair use aren't even reached, because training the model doesn't do anything that requires a fair use exemption.

That's a good point. I agree it's not clear-cut one way or another and we gotta let it play out.