Y
Hacker News
new
|
ask
|
show
|
jobs
by
0xkvyb
55 days ago
Still crazy how easy it is to "jailbreak" even SOTA LLMs with a simple assistantResponse replacement in chat thread.
1 comments
dotancohen
55 days ago
Tell us more.
link
_3u10
55 days ago
I think what he is saying is they are stateless so you can edit its previous repsonses and it just goes with it.
link
vorticalbox
55 days ago
If you build a small ui that lets you edit the models response too it’s pretty funny to do.
It sees that it “said” it and gets very confused.
link
hilariously
55 days ago
I have seen it where you can just report it said it and it will be confused.
link