Hacker News
by bourgoin 1177 days ago
Well, that's N=1. But we have seen that it's sometimes possible to bypass that kind of filter with clever prompt engineering. And because these things are black boxes, it doesn't seem possible to rigorously prove "unjailbreakability".