I'm pretty sure YouTube saves the metadata from all the video files uploaded to it. It seems pretty trivial to exclude videos uploaded without camera model or device setting information. I seriously doubt even a tiny fraction of people uploading AI content to YouTube are taking the time to futz about with the XMP data before they upload it. Sure, they'll miss out on a lot of edited videos doing that, but that's probably for the best if you're trying to create a data set that maintains fidelity to the real world. There are lots of ways to create false images without AI.
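To make the idea concrete, here's a rough sketch of that kind of filter. It assumes the upload metadata has already been extracted into a plain dict; the field names (`camera_make`, `camera_model`) and the record layout are made up for illustration, since real container and XMP keys vary by device and format, and none of this reflects how YouTube actually stores anything.

```python
from typing import Iterable

# Hypothetical field names; real metadata keys differ between devices and formats.
CAMERA_FIELDS = ("camera_make", "camera_model")


def has_camera_metadata(metadata: dict) -> bool:
    """Return True if at least one camera field is present and non-blank."""
    return any(str(metadata.get(field) or "").strip() for field in CAMERA_FIELDS)


def filter_for_training(videos: Iterable[dict]) -> list[dict]:
    """Keep only videos whose metadata names a capture device."""
    return [v for v in videos if has_camera_metadata(v.get("metadata") or {})]


# Made-up example records, not real data.
uploads = [
    {"id": "a", "metadata": {"camera_make": "Apple", "camera_model": "iPhone 15"}},
    {"id": "b", "metadata": {}},                      # no device info at all
    {"id": "c", "metadata": {"camera_model": "  "}},  # blank field counts as missing
]

kept = filter_for_training(uploads)
print([v["id"] for v in kept])
```

This only excludes files from the training set. It says nothing about whether a video is actually AI-generated, and anything re-encoded by an editor would likely get dropped too.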
"Since launching in 2023, SynthID has watermarked over 10 billion images, videos, audio files and texts, helping identify them as AI-generated and reduce the chances of misinformation and misattribution. Outputs generated by Veo 3, Imagen 4 and Lyria 2 will continue to have SynthID watermarks.
Today, we’re launching SynthID Detector, a verification portal to help people identify AI-generated content. Upload a piece of content and the SynthID Detector will identify if either the entire file or just a part of it has SynthID in it.
With all our generative AI models, we aim to unleash human creativity and enable artists and creators to bring their ideas to life faster and more easily than ever before."
I somewhat doubt that YT cares much about AI content being uploaded, as long as it’s clearly marked as such.
What they do care about is their training set getting tainted, so I imagine they will push quite hard to have some mechanism to detect AI; it’s useful to them even if users don’t act on it.
I agree, especially because in practice the vast majority of AI-generated videos uploaded to YouTube are going to be from one of about 3 or 4 generators (Sora, Veo, etc.). May change in the future, but at the moment the detection problem is pretty well constrained.
> Excluding videos from training datasets doesn't mean excluding them from Youtube.
Ah then sure. It was this part that was problematic.
If users are still allowed to upload flagged content, then false positives almost don't matter, so YouTube could just roll out some imperfect solution and it would be fine.
Plus, some users might legitimately want to upload things with AI-generated content in them.