by brabel
757 days ago
> But if openai had started out by seeking permission to train on any and every piece of content out there...

But why would anyone seek permission to use public data? Unless you've got Terms and Conditions on reading your website or you gatekeep it to registered users, it's public information, isn't it? Isn't public information what makes the web great?

I just don't understand why people are upset about public data being used by AI (or literally anything else. Like open source, you can't choose who can use the information you're providing).

In the case being discussed here, it's obviously different, they used the voice of a particular person without their consent for profit. That's a totally separate discussion.
First of all, it's not all public data. Software licenses should already establish that just because something is on the internet doesn't mean it's fair game.
>Unless you've got Terms and Conditions
The New York Times did:
https://help.nytimes.com/hc/en-us/articles/115014893428-Term...
Even if you want to bring up an archive of the pre-lawsuit TOS, I'd be surprised if it hasn't been mostly the same TOS for decades. OpenAI didn't care.
>Isn't public information what makes the web great?
No. Twitter is "public information" (not really, but I'll go with your informal definition here). If that's what "public information" becomes, then maybe we should curate for quality instead of quantity.
Spam is also public information, and I don't need to explain how that only makes the internet worse. Honestly, that's what AI will become if left unchecked.
> Like open source, you can't choose who can use the information you're providing
That's literally what software licenses are for. You can't stop people from ignoring your license, but anyone who breaks it leaves themselves wide open to lawsuits.