| HN Mirror



	by beeboobaa3 770 days ago
	> OpenAI has presumably already ingested and trained on these publishers’ archival data

	So they're admitting to copyright violations and theft?

2 comments

mdavidn 770 days ago

Whether training a model on text constitutes copyright infringement is an unresolved legal question. The closest precedent would be search engines using automated processes to build an index and links, which is generally not seen as infringing (in the US).

beeboobaa3 770 days ago

https://www.rvo.nl/onderwerpen/octrooien-ofwel-patenten/vorm...

stale2002 770 days ago

No, they have not done that. Presumably they believe that the model training was fair use, and no court has said otherwise yet.

It will take years for that stuff to settle out in court, and by that time none of that will matter, and the winners of the AI race will be those who didn't wait for this question to be settled.

beeboobaa3 770 days ago

They believe a lot of things, I'm sure.

> and the winners of the AI race will be those who didn't wait for this question to be settled.

Hopefully they'll be in jail.

stale2002 769 days ago

It's not just the big companies you have to think about, lol.

Sure you can sue OpenAI.

But will you be able to sue every single AI startup that happens to be working on open source AI tech that was all trained this way? Absolutely not. It's simply not feasible. The cat is out of the bag.

beeboobaa3 769 days ago

The US government has worked hard to make the lives of copyright infringers miserable for years, even driving them to suicide.

stale2002 769 days ago

> The US government has worked hard to make the lives of copyright infringers miserable for years

They really have not. The fact that I can download any movie in the world right now, and use all of the open source models on my home PC proves that.

I am sure there are some random one-off cases of infringers being punished, but it mostly doesn't happen.

Especially if we are talking about the entire tech industry.

The government isn't going to shut down every single tech startup in the US, because they are all using these open source AI models.

The government isn't going to be able to confiscate everyone's gamer PCs. The weights can already be run locally.

beeboobaa3 769 days ago

https://en.wikipedia.org/wiki/Aaron_Swartz

https://en.wikipedia.org/wiki/Illegal_number
