by godelski
884 days ago
> I can talk a lot about this, since this is the space I've spent a lot in experimenting.

I'm a researcher in vision generation and haven't read much about LLM detection, but I'm aware of the error rates you mention. I have questions...

What surprises me most is the use of perplexity for detection. Why would you target perplexity? LMs are trained to minimize NLL/entropy. Instruct-based models are then tuned even further in that direction, so that you're minimizing cross-entropy relative to human output (or at least human-desired output). That makes it obvious that such a detector would flag generic or common patterns as AI generated. But I'm just absolutely baffled that this is the main metric being used and, in the case of this paper, the only metric. It also gives a very easy way to fool these detectors: throwing in a random word or a few spelling mistakes should throw off detection, since such changes clearly increase perplexity. To me this sounds like using a GAN's discriminator to identify outputs of GANs (the whole training method is about trying to fool the discriminator!)

(Obviously I'm also not buying the zero-shot claim.)
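To make the "random word" point concrete, here is a minimal sketch. It is not the paper's detector: a toy add-one-smoothed bigram model stands in for the LLM, and the corpus, the sentences, and the names `train_bigram` and `perplexity` are all made up for illustration. Perplexity is computed as the exponential of the mean per-token NLL, and the sketch only checks that inserting one out-of-place word raises it.

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Count contexts and bigrams for a toy bigram language model."""
    contexts = Counter(tokens[:-1])           # every token that has a successor
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = set(tokens)
    return contexts, bigrams, vocab

def perplexity(tokens, model):
    """exp(mean NLL) of `tokens` under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab = model
    v = len(vocab) + 1                        # +1 slot for unseen words
    nll = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
        nll -= math.log(p)
    return math.exp(nll / (len(tokens) - 1))

# Hypothetical training text: repetitive on purpose, so "generic" phrasing
# gets high probability, much like common patterns under a real LM.
corpus = ("the cat sat on the mat . the dog sat on the rug . " * 5).split()
model = train_bigram(corpus)

clean = "the cat sat on the mat .".split()
noisy = "the cat sat on the xylophone mat .".split()  # one random word inserted

ppl_clean = perplexity(clean, model)
ppl_noisy = perplexity(noisy, model)
print(f"clean: {ppl_clean:.2f}  noisy: {ppl_noisy:.2f}")
```

A detector that flags text as AI generated when perplexity falls below some threshold would be more likely to flag `clean` than `noisy`, even though the two differ by a single word. Whether the effect is as strong against a real LLM-based detector is an assumption this toy model cannot settle.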