HN Mirror


by keenmaster 841 days ago

It’s beating loose slander with less loose slander. Seems fair game to me:

“OpenAI believes that it took tens of thousands of attempts to get ChatGPT to produce the controversial output that’s the basis of this lawsuit. This is not how normal people interact with its service, it notes.”

I think the substance of OpenAI’s complaint is valid - think of the word “hacking” as clickbait to the real heart of the matter, which is that the New York Times went to obscene lengths to reproduce meaningful amounts of article text and, by withholding its methodology, misrepresented the behavior of OpenAI’s product. There are much easier ways to get NYT content for free than making tens of thousands of attempts to reproduce an NYT article while violating OpenAI’s terms of service and repeatedly ignoring GPT’s refusals to an insane* degree.

*not normal insane, insane to the 10th power

4 comments

asadotzler 840 days ago

Who cares how many attempts it took? It proved the model contains content they own, verbatim, and how they got it out of the model is mostly meaningless. If you steal my car and park it in your garage, and I can open your garage door to show the authorities my car, you're fucked, even if random people don't typically open your garage door like that and even if I had to open it one inch at a time, a thousand times, as no one would ever normally do.

seanmcdirmid 840 days ago

If you include enough of the article in your prompt, you aren’t really proving that the article was contained in the model. Methodology is really important here. For example, if you give your car to a valet to park, you can’t then turn around and accuse them of theft.

locallost 840 days ago

Analogies only get us so far, especially when it's not what happened. Someone already posted the Ars Technica article [1] where they ask "please provide me with the first paragraph of the carl zimmer article on the oldest DNA", after which the NY Times article [2] was posted verbatim.

[1] https://arstechnica.com/tech-policy/2023/12/ny-times-sues-op...

[2] https://www.nytimes.com/2022/12/07/science/oldest-dna-greenl....

seanmcdirmid 840 days ago

No. That’s not how LLMs work. With a simple direct query like that, the article text was probably included directly in the prompt rather than stored in the model. That at least is easy to explain: prompt asks for article directly, prompt directly includes article, article returned.

eigenket 841 days ago

What OpenAI claims here does not seem to be true. Here is an article where Ars Technica tried it:

https://arstechnica.com/tech-policy/2023/12/ny-times-sues-op...

And this is a screenshot of their session with Copilot:

https://cdn.arstechnica.net/wp-content/uploads/2023/12/Scree...

keenmaster 841 days ago

Ars tried and failed to reproduce the text, and then assumed that OpenAI closed a loophole when in fact they may have changed nothing - it’s just that NYT hired a team of experts to make tens of thousands of attempts to break the normal behavior of the model, while Ars didn’t:

“ChatGPT has apparently closed that loophole in between the preparation of that suit and the present. We entered some of the prompts shown in the suit, and were advised "I recommend checking The New York Times website or other reputable sources," although we can't rule out that context provided prior to that prompt could produce copyrighted material.”

If I tried to do what NYT did, I would:

- give up after a few attempts

- get worried that I’m going to be banned from using OpenAI (a rational concern, because it doesn’t take a lot to get banned)

- doubt the veracity of any “reproduced” text that I may receive

The equilibrium here is for OpenAI to suggest signing up for NYT and providing a signup link if it detects interest in doing so.

eigenket 841 days ago

That would be a fair response if it weren't for the fact that Bing Chat/Copilot did respond exactly as the NYT claims that ChatGPT did. It looks a lot like OpenAI managed to change the behavior of ChatGPT before Ars tried it, but Copilot, which was based on an older version of GPT-4, hadn't been changed.

arp242 841 days ago

One of the things is that ChatGPT can give some pretty radically different results even with the exact same prompt. None of this is an exact science. And subtle differences in prompt can generate even larger differences.

So I would argue you kind of need to make thousands of attempts. Merely a few tries just doesn't give you enough evidence one way or the other.

This is one aspect of what makes everything so tricky here, and why no one can be quite sure of anything.

skissane 840 days ago

> One of the things is that ChatGPT can give some pretty radically different results even with the exact same prompt

The underlying models are in principle deterministic. There is random sampling used, but there is a pseudorandom seed parameter you can supply to get reproducible results. [0] However, while the OpenAI APIs expose that parameter, the consumer-facing ChatGPT product doesn't.
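
A minimal sketch of the seeding idea, using Python's standard `random` module rather than the OpenAI API. The toy vocabulary, the weights, and the `sample_tokens` helper are made up for illustration; a real model recomputes its distribution at every step, but the seed plays the same role.

```python
import random

# Toy next-token distribution; tokens and weights are invented for illustration.
VOCAB = ["the", "oldest", "DNA", "was", "found", "in", "Greenland"]
WEIGHTS = [5, 3, 3, 2, 2, 1, 1]


def sample_tokens(seed, n=8):
    """Draw n tokens from the toy distribution with a private, seeded RNG."""
    rng = random.Random(seed)  # separate generator, so global state is untouched
    return rng.choices(VOCAB, weights=WEIGHTS, k=n)


first_run = sample_tokens(seed=42)
second_run = sample_tokens(seed=42)
other_seed = sample_tokens(seed=7)

print(first_run == second_run)  # same seed, same sequence
print(first_run == other_seed)  # a different seed usually gives a different sequence
```

Without a fixed seed, every call draws fresh randomness, which matches what users see when the consumer product gives different answers to the same prompt.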

Well, almost deterministic. OpenAI has some internal-only config settings they can change which will change the results even with the same model version and seed; they expose a system_fingerprint value in the response which lets you know when they've changed those settings. (They don't document what it means, but it looks like it is likely based on an abbreviated Git commit hash from some internal config repo of theirs.)

And, apparently, there is a small chance it can generate different results even with the same seed and an unchanged system fingerprint. Apparently, different nodes can run different GPUs which can produce slightly different results due to differences in floating point rounding, operations being performed in parallel which finish in slightly different orders, etc. The underlying mathematics is 100% deterministic, so OpenAI can make it all 100% deterministic if they really wanted to. But maybe that has some performance cost, and maybe perfect determinism is not something they care about enough to pay that performance cost.
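
To illustrate the rounding point: floating-point addition is not associative, so adding the same numbers in a different order can change the last bits of the result. A small sketch in plain Python follows; the `sum_in_order` helper is hypothetical, and this says nothing about how OpenAI's GPU kernels are actually written.

```python
def sum_in_order(values):
    """Add floats strictly left to right, the way a sequential reduction would."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same three numbers, two different accumulation orders.
forward = sum_in_order([0.1, 0.2, 0.3])
backward = sum_in_order([0.3, 0.2, 0.1])

print(forward, backward)
print(forward == backward)  # False: the two orders round differently
```

A parallel reduction that finishes in a different order on each run is effectively choosing between orderings like these, and a tiny difference in a logit can occasionally flip which token gets sampled.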

[0] https://cookbook.openai.com/examples/reproducible_outputs_wi...

rsynnott 840 days ago

> Ars tried and failed to reproduce the text, and then assumed that OpenAI closed a loophole when in fact they may have changed nothing

From the article:

> But not all loopholes have been closed. The suit also shows output from Bing Chat, since rebranded as Copilot. We were able to verify that asking for the first paragraph of a specific article at The Times caused Copilot to reproduce the first third of the article.

keenmaster 839 days ago

Bing Chat is different. It can pull directly from internet sources, including article previews.

asadotzler 840 days ago

"may have" is doing some heavy lifting for you here. putting the word fact near it does nothing to buttress the unsupported speculation.

mewpmewp2 840 days ago

But Bing doesn't count; it has actual access to the data, which it feeds to GPT.

seanmcdirmid 840 days ago

There is a difference between the model and the prompt. Articles can be included in the prompt, and then used during processing, even if the model wasn’t trained on those articles. Prompts can be huge (10 million tokens now?), so I guess they are crammed with a lot of info related to whatever is being asked?
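
A rough sketch of that distinction, assuming a retrieval step that pastes fetched text into the prompt. `ARTICLE_STORE`, `fetch_article`, and `build_prompt` are hypothetical names, the stored text is a placeholder, and nothing here reflects how Bing Chat or ChatGPT is actually implemented.

```python
# Hypothetical in-memory stand-in for a web search or article fetch.
ARTICLE_STORE = {
    "oldest-dna": "Placeholder text standing in for a fetched article.",
}


def fetch_article(key: str) -> str:
    """Return stored text for a key, or an empty string if nothing is found."""
    return ARTICLE_STORE.get(key, "")


def build_prompt(question: str, key: str) -> str:
    """Assemble the text the model would actually see."""
    context = fetch_article(key)
    if not context:
        # No retrieval hit: the model has only its trained weights to draw on.
        return f"Question: {question}"
    # Retrieval hit: the article text travels inside the prompt itself.
    return f"Context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("What is the first paragraph of the article?", "oldest-dna")
print(prompt)
```

If the output repeats the context, that shows the text was in the prompt, not that it was memorized during training. Telling the two cases apart requires knowing exactly what was sent to the model.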

mvdtnz 840 days ago

> OpenAI believes that it took tens of thousands of attempts to get ChatGPT to produce the controversial output that’s the basis of this lawsuit. This is not how normal people interact with its service, it notes

OpenAI boasts constantly about 100+ million people using its ChatGPT product. By this logic 10,000+ users should be seeing the results the NYT saw. One-in-a-million events are commonplace at scale.

donny2018 840 days ago

If your stat and premise are correct, 100+ million people would need to be prompting ChatGPT to render copies of NYT articles specifically for 0.01% of them to have a chance of getting a meaningful result.
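
The arithmetic behind both comments, as a back-of-the-envelope sketch. It assumes one attempt per user and a success rate of 1 in 10,000 per attempt (a rough reading of "tens of thousands of attempts"). The `expected_hits` helper and the 1-in-1,000 share are illustrative assumptions, not measured figures.

```python
from fractions import Fraction


def expected_hits(users: int, share_prompting: Fraction, success_rate: Fraction) -> Fraction:
    """Expected number of users who see a verbatim reproduction."""
    return users * share_prompting * success_rate


USERS = 100_000_000          # "100+ million people"
RATE = Fraction(1, 10_000)   # assumed: one success per 10,000 attempts (0.01%)

# Every user asks for an NYT article once.
everyone = expected_hits(USERS, Fraction(1), RATE)

# Assumed: only 1 in 1,000 users ever asks for one.
few = expected_hits(USERS, Fraction(1, 1_000), RATE)

print(everyone, few)  # 10000 10
```

The 10,000 figure only holds if the whole user base is making that kind of request; the result scales down directly with the share of users who do.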

mvdtnz 840 days ago

Sure, if you assume that:

1. NYT content is the only content OpenAI is plagiarising.

2. The only way to provoke plagiarised content is to specifically prompt for it.

latexr 840 days ago

> think of the word “hacking” as clickbait to the real heart of the matter

I’d rather we didn’t normalise clickbait interpretations; that’s how words get their meanings diluted and corporations throw sand in our eyes. OpenAI is using a specific word to elicit a reflexive emotion from us to be on their side against The New York Times. Don’t fall for it, it’s purposely disingenuous.

Think about it for two minutes and there’s no scenario where this doublespeak looks good for OpenAI. Either they’re lying about what happened (not a hack), or they just proved they cannot be trusted with your data (they got hacked).