Hacker News
by minkzilla 1865 days ago
Or you just train a machine to do it, generate a bunch, and have a second machine sort out any it thinks are machine-generated.

The best fake-detecting model will always lag behind the best generator model whose fakes it is trying to detect.
I think I see what you’re saying, but why is this so?
In essence, detecting which one is fake is a common way to train the generator: you tweak the generating process to "fix" any detectable flaw, and you keep training until (as far as your system is concerned) the generated texts are indistinguishable from the real ones. A better system might distinguish them, but that better system can be adapted relatively trivially to generate better texts that it won't be able to distinguish from real ones.
You basically just described a GAN. Neat!
GANs work by feeding the mistakes back and forcing the generator model to improve its cheating. In this case, throwing out the ambiguous titles would only be an independent filter: nothing is fed back to the generator.
It’s not an exact description of a GAN, but then I never said it was either.
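To make the filter-versus-feedback distinction concrete, here is a toy sketch in Python. It is not a real GAN and not anyone's actual setup: the "generator" just emits numbers with a detectable offset, the "detector" scores distance from where real data sits, and every name in it (`generate`, `detector_score`, `filter_only`, `adversarial_step`, `train`, `REAL_MEAN`) is made up for illustration.

```python
import random

REAL_MEAN = 0.0  # assumption: "real" samples are centred here


def generate(bias, n, rng):
    """Toy generator: real-looking noise plus a detectable offset `bias`."""
    return [REAL_MEAN + bias + rng.gauss(0.0, 1.0) for _ in range(n)]


def detector_score(sample):
    """Toy detector: distance from the real mean; higher means 'looks fake'."""
    return abs(sample - REAL_MEAN)


def filter_only(samples, threshold):
    """Independent filter: drop flagged samples, leave the generator alone."""
    return [s for s in samples if detector_score(s) < threshold]


def adversarial_step(bias, samples, lr=0.5):
    """Feedback step: nudge the generator toward what the detector accepts."""
    mean_offset = sum(samples) / len(samples) - REAL_MEAN
    return bias - lr * mean_offset


def train(bias, steps, n, rng):
    """Repeat the feedback step; the generator's flaw shrinks each round."""
    for _ in range(steps):
        bias = adversarial_step(bias, generate(bias, n, rng))
    return bias


if __name__ == "__main__":
    rng = random.Random(0)
    samples = generate(3.0, 200, rng)
    kept = filter_only(samples, threshold=1.0)
    print("filter only: kept", len(kept), "of", len(samples), "(generator unchanged)")
    print("after feedback: bias =", train(3.0, steps=20, n=200, rng=rng))
```

With filtering alone the generator never changes, so most of its output keeps getting thrown away. With the feedback loop the offset shrinks toward zero and this particular detector stops being able to tell the difference, which is roughly the "lag" argued above.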