by skytrue
1261 days ago
Disclaimer: I'm working on an app to address the impending wave of generative content, so I'm somewhat biased. As I'm sure many of us did, I tested the app here on GPT-3 output. Unmodified, it was detected as GPT-3. Great! However, when I added about 10 words to the GPT-3 output, my "human" score shot up by about 60 points, and the app determined the text was human-generated. This is going to be the problem underlying _any_ model that tries to identify "AI"-generated text: a human can add or remove a few words, and it throws the entire thing off. We need to explore other approaches to this problem.

In all seriousness though, this is really a cat-and-mouse game. Working in CV, I might be biased to think that images are going to be easier to detect than text, but we are going to get better. I'm not saying we shouldn't create detectors (we should), but I think we also need to be aware that this is a cat-and-mouse game, and we need to have real social conversations about it. Then again, we still fail to have these conversations when it comes to computer security, so maybe we're doomed (or maybe this will be the catalyst). It's also important to note that a lot of damage can be done even with images and text that are easy to detect as fake. I've seen plenty of fake Twitter and LinkedIn accounts that use StyleGAN profile pictures (with all the telltale signs).
I'm curious: what kind of model are you using? How interpretable is it?