The Times Sues OpenAI and Microsoft Over A.I.’s Use of Copyrighted Work

	The Times Sues OpenAI and Microsoft Over A.I.’s Use of Copyrighted Work (nytimes.com)
	45 points by thecybernerd 904 days ago

7 comments

cowsup 904 days ago

Inevitable outcome. Since ChatGPT launched, nobody has a clue as to what is legal and what is illegal with these chat-based LLMs.

Is the content that LLMs produce enough to rise to the level of copyright infringement? Is the fact that a company trained their LLM on your data, with the knowledge it would be used for outputs (=profit), enough that all of their outputs should be considered, to at least a minuscule degree, influenced by your work? How would ChatGPT's "training" differ from, say, another journalist who reads the NYT, and subconsciously uses that to help provide better services?

None of us can answer these questions definitively. That the courts would end up hearing these sorts of arguments was a foregone conclusion. I think a lot of the large LLM vendors (certainly OpenAI competitors) are going to breathe a sigh of relief that this is happening sooner rather than later, so they know where the legal lines are to be drawn.

berniedurfee 904 days ago

This will be an interesting inflection point for humanity.

Though, call me jaded, but I can’t help but doubt that the _actual_ content creators, the writers themselves, will see any of the money should The Times win or settle the case.

donohoe 904 days ago

The content creators for the Times have already been paid for their work.

berniedurfee 904 days ago

They were paid when the original content was to be printed or posted on the internet.

Subsequently selling (or extracting compensation for) those works to AI companies is an emergent revenue stream.

I suppose the NYT isn’t legally obligated to share that revenue fairly with the authors, but it’d be awful nice if they did.

lobsterthief 904 days ago

Believe me, publishers have enough trouble keeping writers employed. If they could give them a larger cut or do some kind of revenue share, most editors and GMs would love to (and many do).

CrypticShift 904 days ago

The trajectory we're seeing with small, high-quality AI models, coupled with self-imposed censorship and the foreseeable scarcity of high-quality training data due to new copyright law, leads me to forecast a surge in "pirate" models.

Increasingly, the distinction between core model training and fine-tuning might become ambiguous (how?). Considering this, we might witness a trend where custom 'add-ons' for AI models become commoditized. Imagine simply downloading a "New York Times" pack to enhance your unofficial "pirate" language model.

jruohonen 904 days ago

"The legal landscape surrounding generative-AI is unsettled, with the technology still in its early days. There are other lawsuits that could test the rights of AI companies to “scrape” content from the web to train AI tools, including one by several prominent book authors against OpenAI. In February, Getty Images sued the AI art company Stability AI in Delaware, alleging that it had infringed on Getty’s copyrights."

Any news or speculations on these cases?

jdkee 904 days ago

https://archive.is/cCIeJ

cranberryturkey 904 days ago

Heh. Good luck with that one. Everyone is crawling the web now. Why didn't they sue Google for using their content in the SERPs?

gniv 904 days ago

There are significant differences: attribution and snippetting. OpenAI probably cannot claim these.

mhss 901 days ago

And Google search doesn't "generate" new content that could put the very entities it learned from out of business.

mdaniel 904 days ago

currently, and the non-paywalled link: https://news.ycombinator.com/item?id=38781941
