Hacker News
by brentsch 2496 days ago
I'm curious about the "fine-tuning based detection" mentioned in the report ("Fine-tunes a language model to 'detect itself'... over a range of available settings"). Does anyone know good articles/papers (or have an off-the-top tl;dr) to get a high-level grasp of "self-detection" for generative models?

Hiya, I work at OpenAI. I think the Grover paper is a good place to read about some of this: https://arxiv.org/abs/1905.12616 We're likely publishing more on detecting fine-tuned outputs in the future, also.
Many thanks! Looking forward to reading the OpenAI research when it comes out as well.