by jasonlfunk 655 days ago
Can someone help me understand why it's a problem for companies to train these huge LLMs on your copyrighted material? What exactly is the harm being done to the copyright holder?

I can understand why the New York Times (for example) wants to claim that a couple of billion-dollar companies have done it actual harm, but I am struggling to identify what that harm is.

2 comments

>The complaint cites several examples when a chatbot provided users with near-verbatim excerpts from Times articles that would otherwise require a paid subscription to view. It asserts that OpenAI and Microsoft placed particular emphasis on the use of Times journalism in training their A.I. programs because of the perceived reliability and accuracy of the material.

>In one example of how A.I. systems use The Times’s material, the suit showed that Browse With Bing, a Microsoft search feature powered by ChatGPT, reproduced almost verbatim results from Wirecutter, The Times’s product review site. The text results from Bing, however, did not link to the Wirecutter article, and they stripped away the referral links in the text that Wirecutter uses to generate commissions from sales based on its recommendations.

>The lawsuit also highlights the potential damage to The Times’s brand through so-called A.I. “hallucinations,” a phenomenon in which chatbots insert false information that is then wrongly attributed to a source. The complaint cites several cases in which Microsoft’s Bing Chat provided incorrect information that was said to have come from The Times, including results for “the 15 most heart-healthy foods,” 12 of which were not mentioned in an article by the paper.

https://www.nytimes.com/2023/12/27/business/media/new-york-t...

This is a pretty good discussion of some of the other issues: https://hls.harvard.edu/today/does-chatgpt-violate-new-york-...

Somewhere else in this thread, an example is given: an LLM is trained on all of Frank Miller's copyrighted material (he makes comic books). A user then comes along and asks the trained LLM to make a comic book that looks like Frank Miller's comic books, and the user then sells the newly created comic book for profit. Should Frank Miller not get something?
Though that is different from saying Frank Miller was harmed. I guess if his sales dropped because people were buying GPT stuff instead, that would be the case.