Hacker News
by bonzaidrinkingb 935 days ago
That is a pretty convoluted and expensive way to use ChatGPT as an internet search. I see the vulnerability, but I do not see the threat.

I've seen it "exploited" way back when ChatGPT was first introduced, and a similar trick worked for GPT-2 where random timestamps would replicate or approximate real posts from anon image boards, all with a similar topic.


I think it may change the discussion about copyright a bit. I've seen many arguments that while GPTs are trained on copyrighted material, they don't parrot it back verbatim and their output is highly transformative.

This shows pretty clearly that the models do retain and return large chunks of text exactly as they read them.
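
To make "exactly as they read them" concrete, here is a toy sketch of a verbatim check: does a model output share a run of at least N consecutive words with a known text? The function name, the word-level splitting, and the 8-word threshold are all my own assumptions for illustration, not the method used in the article.

```python
def shares_verbatim_run(output, reference, min_words=8):
    """Return True if `output` contains at least `min_words` consecutive
    whitespace-separated words that also appear, in order, in `reference`."""
    out_words = output.split()
    ref_words = reference.split()
    # Every run of `min_words` consecutive words in the reference text.
    ref_runs = {
        tuple(ref_words[i:i + min_words])
        for i in range(len(ref_words) - min_words + 1)
    }
    # Slide the same window over the output and look for an exact hit.
    return any(
        tuple(out_words[i:i + min_words]) in ref_runs
        for i in range(len(out_words) - min_words + 1)
    )


reference = (
    "Call me Ishmael. Some years ago, never mind how long precisely, "
    "having little or no money in my purse"
)
copied = "Sure! Here it is: Some years ago, never mind how long precisely, having little"
paraphrase = "The narrator says he went to sea a while back because he was broke."

print(shares_verbatim_run(copied, reference))      # True
print(shares_verbatim_run(paraphrase, reference))  # False
```

A paraphrase or summary fails this check; a memorized passage passes it, which is the distinction the "transformative" argument depends on.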

I suspect ChatGPT is using a form of clean-room design to keep copyrighted material out of the training set of deployed models.

One model is trained on copyrighted works in a jurisdiction where this is allowed and outputs "transformative" summaries of book chapters. This serves as training data for the deployed model.

The article describes how the deployed model can regurgitate chunks of copyrighted works - one of the samples literally ends in a copyright notice.
If these were copyrighted works, how did these end up in the public comparison dataset?

Sure, some copyrighted works ended up in the Pile by accident. You can download these directly, without the elaborate "poem" trick.

That sounds like copyright washing, if there is such a thing.
If that's copyright washing, so are Cliff's Notes.
Yup, though a lot of people are acting now as though every already-established principle of fair use needs to be revised suddenly by adding a bunch of "...but if this is done by any form of AI, then it's copyright infringement."

A cover band who plays Beatles songs = great
An artist who paints you a picture in the style of so-and-so = great

An AI who is trained on Beatles songs and can write new ones = exploitative, stealing, etc.
An AI who paints you a picture in the style of so-and-so = get the pitchforks, Big Tech wants to kill art!

> A cover band who plays Beatles songs

Has to pay the Beatles for the pleasure of doing so.

This discussion about art "in the style of" being stealing or exploitative didn't start with AI. For quite some time there have been complaints about advertisers commissioning sound-alike tunes to avoid paying licensing fees. AI is only automating it and making it possible on an industrial scale.
Well, I don't know about that. I strongly suspect ChatGPT could deliver whole copyrighted books piece by piece. I suspect that because it most certainly can do that with non-copyrighted text. Just ask it to give you something out of the Bible or Moby Dick. Cliff's Notes can't do that.
Why would you suspect that?
To me, it seems like more of a competitive issue for OpenAI if part of their secret is the ability to synthesize good training data, or if they're purchasing training data from some proprietary source.
I suspect OpenAI's advantage is their ability to synthesize a good fine-tuning dataset. My question would be: is this leaking data from the fine-tuning dataset or from the initial training of the base model? The base model training data is likely nothing special.
Good point. But many are already directly training on output from GPT. Probably more efficient than copying the raw training data. Especially if it relies on this non-targeted approach.
> I do not see the threat.

It becomes one if for some reason you decide to train your model on sensitive data.

In certain circumstances, I could see that.

Then again, if you have access to a model trained on sensitive data, why not ask the model directly, instead of probing it for training data? If sensitive data is never meant to be reasoned on and output, why did you train on it in the first place?

The entity training the model and the users of the model are not necessarily the same. Asking the model directly will not (or at least shouldn't) work if there are guardrails in place against giving out specific information. As for why you would train on sensitive data: there are many reasons, one being that you train on such a huge number of items that you can't guarantee nothing is in there that shouldn't be.
If there are guardrails in place not to output sensitive data (good practice anyway), then how would this technique suddenly bypass that?

I still have trouble seeing a direct threat or attack scenario here. If it is privacy-sensitive data they are after, a regex on their comparison index should suffice and yield much more, much faster.
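
A minimal sketch of what I mean by a regex pass, assuming the index is just a list of text documents. The patterns are deliberately simple, and `find_pii` and `sample_index` are made-up names; real PII detection needs far more careful patterns than these two.

```python
import re

# Rough patterns for illustration only; both will miss and over-match real data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")  # US-style 10-digit numbers


def find_pii(documents):
    """Scan each document and return (doc_id, label, matched_text) tuples."""
    hits = []
    for doc_id, text in enumerate(documents):
        for label, pattern in (("email", EMAIL_RE), ("phone", PHONE_RE)):
            for match in pattern.finditer(text):
                hits.append((doc_id, label, match.group()))
    return hits


# Hypothetical stand-in for the comparison index.
sample_index = [
    "Contact me at jane.doe@example.com for details.",
    "Nothing sensitive here.",
    "Call 555-123-4567 after 5pm.",
]

for hit in find_pii(sample_index):
    print(hit)
```

One linear scan like this touches every document once, with no model queries at all, which is why I'd expect it to be cheaper than the extraction trick.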

I think the exploit would be training on ChatGPT users' chat history.

> Chat history & training
> Save new chats on this browser to your history and allow them to be used to improve our models. Unsaved chats will be deleted from our systems within 30 days. This setting does not sync across browsers or devices.

If ChatGPT ever outputs other users' chat history, the company is as good as dead. If that could be exploited using this technique, which has been out in the wild for over a year: show me the data.
That was a regular frontend bug though, not an issue with the LLM.
It is an issue with the company though. I saw that as well. The point is that leaking user data doesn't destroy startups; it barely even hurts well-established companies.
Read OpenAI's response to this security issue carefully - it tells you a lot about how they think of being responsible for issues like this. I remember they put all the blame on the open source library, rather than taking responsibility themselves.