These particular videos are produced with this very approach (GANs), so the generator is already highly adapted to produce data that these techniques can't distinguish from a real video.
If it's built with state-of-the-art GANs, then you won't be able to reliably detect it with the discriminators from those same state-of-the-art GANs. If you manage to build a better automatic discriminator than the authors did, they could plug it directly into the adversarial training loop to build a better generator, and your detection approach would again stop working.
Remember that in the end there's a fundamental asymmetry: in theory, there could be a perfect generator (not that we're close to perfect yet) that generates indistinguishable samples, and thus there cannot be a perfect discriminator.
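
To make that asymmetry concrete, here is a toy sketch (not anything from a real deepfake pipeline). It assumes a tiny finite sample space where both distributions are known exactly and every outcome has nonzero probability, and it uses the standard result that the best possible discriminator is D*(x) = p_real(x) / (p_real(x) + p_gen(x)). The function names `optimal_discriminator` and `best_accuracy` are made up for this illustration.

```python
import numpy as np

def optimal_discriminator(p_real, p_gen):
    """Bayes-optimal D*(x) = p_real(x) / (p_real(x) + p_gen(x)).

    Assumes a finite sample space and p_real(x) + p_gen(x) > 0 everywhere.
    """
    p_real = np.asarray(p_real, dtype=float)
    p_gen = np.asarray(p_gen, dtype=float)
    return p_real / (p_real + p_gen)

def best_accuracy(p_real, p_gen):
    """Best achievable accuracy when real and generated samples are
    presented equally often (prior 1/2 each)."""
    p_real = np.asarray(p_real, dtype=float)
    p_gen = np.asarray(p_gen, dtype=float)
    return 0.5 * np.sum(np.maximum(p_real, p_gen))

# Hypothetical distributions over three possible "videos".
p_real = [0.5, 0.3, 0.2]
imperfect_gen = [0.2, 0.3, 0.5]   # generator that still differs from reality
perfect_gen = p_real              # generator that matches reality exactly

print(optimal_discriminator(p_real, imperfect_gen))  # varies per sample
print(best_accuracy(p_real, imperfect_gen))          # better than a coin flip
print(optimal_discriminator(p_real, perfect_gen))    # 0.5 for every sample
print(best_accuracy(p_real, perfect_gen))            # exactly a coin flip
```

Once the generator's distribution matches the real one, even the best possible discriminator outputs 0.5 on every sample, so no detector can do better than guessing.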