Hacker News
by JustBreath 1148 days ago
> Monitor them for "dangerous" thought like we do with LLM

We do that?

1 comment

Not very well, given the existence of "DAN" and similar prompt hacks, but yes.