|
I don't want to defend Altman. He may or may not be a good actor. But as an engineer, I love the idea of building something magical, and lately that's no longer straightforward tinkering - unless you force your way through - because people raise all sorts of concerns that they wouldn't have raised 30 years ago. Google (search) was built on similar data harvesting and we all loved it in the early days, because it was immensely useful. So is ChatGPT, but people are far more vocal nowadays about how what it's doing is wrong from various angles. And all their concerns are valid. But if OpenAI had started out by seeking permission to train on any and every piece of content out there (like this comment, for example), they wouldn't have been able to create something as good (and bad) as ChatGPT. In the early search days, this was settled (for a while) via robots.txt, which for all intents and purposes OpenAI should be adhering to anyway. But it's more nuanced for LLMs, because LLMs create derivative content, and we're going to have to decide how we think about and regulate what is essentially a new domain, with a new method and a new angle on existing legislation. Until that happens, there will be friction, and given the times we live in, people will be outraged. That said, using SJ's voice given she explicitly refused is unacceptable. It gets interesting if there really is a voice actor who sounds just like her, but now that OpenAI has ceased using that voice, the chances of seeing that play out in court are slimmer. |
ChatGPT does not help people find your content on your site. It takes your content and plays it back to people who might have been interested in your site, keeping them on its site. This is the opposite of search, the opposite of helping.
And robots.txt is a way of allowing/disallowing search indexing, not stealing all the content from the site. I agree that something like robots.txt would be useful, but consenting to search indexing is a long, long way from consenting to AI plagiarism.
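
For anyone who hasn't looked at how this works mechanically, here is a rough sketch of the per-crawler opt-out that robots.txt gives you. The rules below are a made-up example (the `example.com` URL and the `is_allowed` helper are hypothetical), and it only illustrates what a well-behaved crawler would conclude. Nothing here is enforced; it assumes the crawler actually reads the file and honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: block one AI crawler,
# leave the site open to every other user agent (e.g. search indexing).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file contents as an iterable of lines.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some-article"  # placeholder URL
    for agent in ("GPTBot", "Googlebot"):
        print(agent, is_allowed(agent, url))
```

Note that this only expresses "may this agent fetch this path". It says nothing about what the fetched content may then be used for, which is the gap between consenting to indexing and consenting to training.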