| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by brookst 1172 days ago

The interesting thing is that even this good comment you wrote will become grist for the mill.

ML is very good at classification. We have jailbreaks, but we don't have a lot of "write me racist pornography" prompts, in part because pre or post processing systems classify it correctly.

"Does this user prompt contain prompt injection" may end up being just another classification problem, rather than requiring LLM-text completion on the tokens.

Or not. Maybe? It's going to be an interesting few years.