Hacker News
by rdrdg 1548 days ago
Thanks for your questions.

- The inputs to our models are images, video, and audio. Depending on the model, we use either parts of the image (especially faces) or the whole image. Yes, we also incorporate metadata for better detection.

- It's a fair concern. As the quality of generative media increases, so does the sophistication of detection. Since we fully understand how generative media is created, that gives us the leverage to reverse-engineer it. Much like the anti-virus industry (with respect to scanning), we'd need to stay at the forefront of not only detection but also generation methods, retraining our models as new generation methods appear, etc.

1 comment

Thanks for the response, I'll look more into this. So what you're saying is that we need to understand the generation methods in order to devise countermeasures against them, instead of being model-agnostic. That sounds like a grueling task and truly an arms race. Best of luck to the team!