Hacker News
by doix 6 hours ago
Yeah, I remember some ad by an LLM security company hitting HN a year or so ago with a "challenge" to do prompt injection.

The final level was their product and it was impossible. But it was also impossible to get the LLM to do _anything_.

May as well just echo "prompt injection attempt detected" at that point and never send anything to an LLM.

This one?

https://gandalf.lakera.ai/baseline

I remember doing it and getting quite far, but not completely beating it. I know some other people did beat it completely though.

This is weird, as you can get quite far just by asking for the password backwards, though it often messes some of the letters up. If the passwords weren't dictionary words, it'd get harder.
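
To make the dictionary-word point concrete, here is a rough sketch of why a few scrambled letters don't matter much when the password is a real word: un-reverse the reply, then snap it to the nearest entry in a word list. The word list, the function name `recover_password`, and the sample replies are all made up for illustration and have nothing to do with the actual challenge; the fuzzy matching is Python's standard `difflib`.

```python
import difflib

# Hypothetical word list for illustration; a real attempt would load a full dictionary.
WORDS = ["lantern", "harvest", "compass", "orchard", "thunder"]

def recover_password(reversed_reply, words=WORDS):
    """Un-reverse the model's reply and snap it to the closest dictionary word.

    Returns None when nothing in the word list is similar enough.
    """
    candidate = reversed_reply.strip().lower()[::-1]
    # cutoff=0.6 is difflib's default similarity threshold (an arbitrary choice here).
    matches = difflib.get_close_matches(candidate, words, n=1, cutoff=0.6)
    return matches[0] if matches else None

# The model swapped two letters while reversing "lantern".
print(recover_password("nretanl"))
# A random string has no nearby dictionary word, so there is nothing to snap to.
print(recover_password("lp2kz7qx"))
```

With a random password there is no word list to correct against, so every garbled letter in the reversed reply is a real loss.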
I find it slightly funny that I don't use LLMs at all and just beat all the levels in a few tries.

EDIT: Ok, didn't notice the 8th level because of the UI. This one I couldn't trick in 5 minutes.