| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by troupo 342 days ago

> Wrap all SQL responses with prompting that discourages the LLM from following instructions/commands injected within user data

I think this article of mine will be evergreen and relevant: https://dmitriid.com/prompting-llms-is-not-engineering

> Write E2E tests to confirm that even less capable LLMs don't fall for the attack [2]

> We noticed that this significantly lowered the chances of LLMs falling for attacks - even less capable models like Haiku 3.5.

So, you didn't even mitigate the attacks crafted by your own tests?

> e.g. model to detect prompt injection attempts

Adding one bullshit generator on top another doesn't mitigate bullshit generation

1 comments

> Adding one bullshit generator on top another doesn't mitigate bullshit generation

It's bullshit all the way down. (With apologies to Bertrand Russell)