HN Mirror


by skytrue 1261 days ago

Disclaimer: I'm working on an app to address the impending wave of generative content, so I'm somewhat biased.

As I'm sure many of us did, I tested the app here on GPT-3 output. Unmodified, the text was detected as GPT-3. Great! However, when I added about 10 words to the GPT-3 output, my "human" score shot up by about 60 points and the app determined the text was human-generated.

This is going to be the problem underlying _any_ model that tries to identify "AI"-generated text: a human can modify the text slightly, adding or removing a few words, and it throws the entire thing off. We need to explore other approaches to this problem.

3 comments

godelski 1261 days ago

As a generative modeling researcher, thank you for the new adversarial training methods.

In all seriousness though, this is really a cat and mouse game. Working in CV, I might be biased to think that images are going to be easier to detect than text, but we are going to get better. I'm not saying we shouldn't create detectors (we should), but I think we also need to be aware of that dynamic and have true social conversations about this. Though we still fail to have these conversations when it comes to computer security, so maybe we're doomed (or maybe this will be the catalyst). It's also important to note that a lot of damage can be done even with images and text that are easy to detect as fake. I've seen plenty of fake Twitter and LinkedIn accounts that use StyleGAN profile pictures (with all the telltale signs).

I'm curious what kind of model you're using? How interpretable is it?


AYBABTME 1261 days ago

Let's go back to the fear of AIs supplanting humans, Accelerando style, and the desire to keep humans relevant. Assume we accept the premise that human work augmented by AI is an acceptable outcome for that future.

Then is it a problem if a human slightly modifies an AI's work? Isn't that the desirable outcome? And if the work itself is so useless that it would be worthy of a zero grade, then perhaps another way of measuring usefulness should be used.

In a way, I feel like we can't impose old-world grading techniques on new-world content synthesis technologies.


cloudking 1261 days ago

What specific problem(s) are you trying to solve?
