by favsq 1173 days ago
In what way does ChatGPT behave like they own the world?
4 comments

By indexing and training on everything it can find on the internet?!

To explain this further: models from OpenAI et al. (which are commercial products) are being trained on content published under licenses that allow only non-commercial use. Do those systems respect these licenses? It doesn't look like it. "AI companies" need to follow the law, but since nobody can look inside their black boxes, we can't verify that they do. That's where legislation like this comes from.

> By indexing and training on everything it can find on the <PUBLIC> internet?!

and that's bad because?

I would see the point if they were training on private data that I had entrusted to somebody and that they had obtained illegally, without my permission. Are they doing that?

See my edit: They ignore licensing information and train on the data, possibly including privacy-related information, without any regard for it.

See this: https://news.ycombinator.com/item?id=32573523

What kind of "privacy related information"? This is data on the open internet!
They don't copy and reproduce the data. They change it enough that the licence no longer has any say. It's called fair use.
Fair Use is a US-specific notion and doesn’t exist in that form in most other countries.
Search engines (basically Google, and now ChatGPT) have a history of moving beyond the 10 blue links that search used to be, for better or worse, at the cost of the people who create the content.

Also, neither company seems to have much regard for user privacy.

They use every bit of data they can find without regard to the rights of the authors, publishers, or subjects of the data?
Put a robots.txt file in place if you don’t want a search engine to index you. GPT is just a semantic search engine.
How is this the business of a PRIVACY watchdog?
Data wants to be free.
I think you mean you want data to be free. In many situations I agree with you, but ascribing wishes or desires to the concept of data itself really isn't an argument of any substance.
Is this coming from the same community that has always said that copyright has to be abolished?
Orthogonal complaint.

As long as copyright is here, big players should be bound by it to the same degree that they push legal systems to bind the little guy.

What you get instead is the big guy pilfering from the little guys under the justification that "it's different when we do it, and if you challenge us, we'll put our subsidized legal department to work burying you."

Copyright needing significant overhaul or abolition doesn't detract from that state of affairs. I hope we can agree?

The same community that is upset when people are caught using licensed open source software without following the license's requirements, yes.
No.
Read the ruling.