Hacker News
by ideonexus 849 days ago
The copyright law portion of the lawsuit is interesting and I'm curious about how that will go, but the NYT has a second argument that every article I read completely ignores: ChatGPT routinely attributes falsehoods to the NYT. It's a problem I've had with AI since the beginning: you have to fact-check everything it tells you because it will confidently make up references and facts all the time. It's one thing for ChatGPT to quote a NYT article verbatim; it's another thing for it to completely make up stories and then attribute them to the NYT. Balancing copyright and fair use is an interesting debate, but when your AI "hallucinates" a completely fabricated article and attributes it to your organization, that's damaging.
3 comments

I agree with you. It's hard to see how OpenAI wins the trademark portion of this NYT lawsuit. There is no fair use clause in trademark law that covers attributing hallucinations to a trademarked entity.

https://en.wikipedia.org/wiki/Fair_use_(U.S._trademark_law)

LLMs likely don't need proprietary data to train effectively. However, as long as the training data includes references to the NYT, misattribution issues may arise.

We certainly need measures to prevent defamation by LLMs, or any text generators, and their creators. It's challenging to determine where to draw the line—from decryption tools that decipher random bits, to web browsers displaying text, to simple text editors, to n-gram Markov chain text generators, to shallow RNNs, to GPT-1, and beyond. Should we hold the tool creators or the tool users accountable for misuse?

In my view, the worst outcome of the NYT winning the lawsuit wouldn't be OpenAI halting progress in generative text tools. The real concern is that OpenAI, with its resources, might find technological solutions to these issues, while startups and hobbyists with limited resources could be forced to stop operating entirely.

>In my view, the worst outcome of the NYT winning the lawsuit wouldn't be OpenAI halting progress in generative text tools.

That's the best outcome in many more views than yours.

I have only read the first quarter of William Faulkner's The Sound and the Fury; this is notoriously a "difficult book" to understand, particularly given that it takes an unreliable first-person POV, is anachronistic, and seems to be narrated by a mental invalid (and lacks normal punctuation, particularly quotation marks).

----

...so I asked ChatGPT to help me understand the first chapter (80 pages). The chapter ends with the narrator being called "Maury," so I asked AI "is TS&tFury narrated by Maury?" It responded "no, it's Benjy" (which was initially more confusing than just reading Faulkner).

But upon further questioning (without my actually knowing for certain, as a reader), it turns out that it does arrive at what I presume is the correct response, which is that Benjy IS Maury.

----

So while it was overall helpful, it took coaxing from an avid reader who isn't even done with the book. I took the AI's last piece of advice, which was to purchase a human-authored companion reader for TS&tFury =P