by 0xdeadf1sh 149 days ago
This might work on a small 7B or 14B model, but >70B models are already pretty good at identifying prompt injections. You will probably need to use weird, out-of-distribution tokens (remember SolidGoldMagikarp?).