Show HN: Thought Forgery, a new technique for jailbreaking LLMs
2 points by UltraZartrex 270 days ago
Hi HN, I'm an independent security researcher and wanted to share a new vulnerability I've discovered.

My account is too new to submit the direct link, so I'm making a text post instead.

The technique is called "Thought Forgery" (CoT Injection, i.e. chain-of-thought injection). It works by forging the AI's internal monologue, and the forged reasoning acts as a universal amplifier for other jailbreaks. I've confirmed it works on the latest models from Google, Anthropic, OpenAI, etc.

I'd be happy to share the link to the full technical write-up on GitHub in the comments if anyone is interested.

3 comments

This is well known
OK, I couldn't point to where I read about it; I just know it already, so I assumed it was well known.
Please do post your write-up. This is interesting but pretty vague, frankly.
sure
Thank you!