by rnosov 1212 days ago
Replying to myself as I can't edit my response any more. I've investigated a bit more, and the attack as described by Greshake[1] seems much more realistic to me than I initially thought.

[1] https://github.com/greshake/llm-security

1 comments

Yeah, that's me. Right now it seems very difficult to get people's attention on this and make them take it seriously. On a side note, your project is also currently putting unfiltered model output straight into osascript, so a lot of the fancy gymnastics needed in the paper to make things work with only search abilities aren't required in this case.
Just for the avoidance of doubt, "AI Files" is not my project and I'm not affiliated with the author in any way.

The osascript line does look a bit dodgy to me, but perhaps it is safe. Still, I can see how things might go downhill quickly with this approach...
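
To make the concern concrete, here is a minimal sketch of one way to keep model output out of the script source. It is not the code of the project under discussion. It assumes the caller only needs to display the text, and it relies on osascript passing trailing command-line arguments to the script's `on run argv` handler. The names `SCRIPT`, `build_command` and `show_model_output` are hypothetical. How osascript's option parsing treats a value that starts with `-` is not checked here, so that case may need extra handling.

```python
import subprocess

# Fixed AppleScript source. The untrusted text arrives as data in argv
# and is never spliced into the script itself.
SCRIPT = """on run argv
    display notification (item 1 of argv) with title "Model output"
end run"""


def build_command(model_output: str) -> list[str]:
    """Return an osascript argv with the model output as a separate argument."""
    return ["osascript", "-e", SCRIPT, model_output]


def show_model_output(model_output: str) -> None:
    """Run the fixed script (macOS only); no shell is involved."""
    subprocess.run(build_command(model_output), check=True)
```

With this shape, a payload such as `" & (do shell script "id") & "` ends up as the notification text instead of being evaluated, because it is never part of the AppleScript source. This only covers the display case; letting the model choose which script to run would need a different design, such as an allowlist of fixed actions.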