Hacker News new | ask | show | jobs
by anabis 14 days ago
OpenAI did this first.

> In addition to safety training, automated classifier-based monitors detect signals of suspicious cyber activity and route high-risk traffic to a less cyber-capable model (GPT-5.2).

https://developers.openai.com/codex/concepts/cyber-safety