| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by 0xkvyb 55 days ago
	Still crazy how easy it is to "jailbreak" even SOTA LLMs with a simple assistantResponse replacement in chat thread.

1 comments

Tell us more.

I think what he is saying is they are stateless so you can edit its previous repsonses and it just goes with it.

If you build a small ui that lets you edit the models response too it’s pretty funny to do.

It sees that it “said” it and gets very confused.

I have seen it where you can just report it said it and it will be confused.