by superfrank 551 days ago
I know there are people acting like it's obvious that this is AI, but I get why people wouldn't catch it, even if they know that AI is capable of creating a video like this.

A) Most of the giveaways are pretty subtle and not what viewers are focused on. Sure, if you look closely the fur blends in with the pavement in some places, but I'm not going to spend 5 minutes investigating every video I see for hints of AI.

B) Even if I did notice something like that, I'm much more likely to write it off as a video filter glitch, a weird video perspective, or just low quality video. For example, when they show the inside of the car, the vertical handrails seem to bend in a weird way as the train moves, but I've seen similar things in real videos shot with wide angle lenses. Similar thoughts on one of the bystanders' faces going blurry.

I think we just have to get people comfortable with the idea that you shouldn't trust a single unknown entity as the source of truth on things because everything can be faked. For insignificant things like this it doesn't matter, but for big things you need multiple independent sources. That's definitely an uphill battle and who knows if we can do it, but that's the only way we're going to get out the other side of this in one piece.

8 comments

I agree. Also, tangentially related: I use a black and white filter on my phone, and it is way harder to distinguish fake and real media without the color channels to help. I couldn't immediately find anything in the subway clip which gave it away.
I've definitely seen the skin-blurring filters that everyone already uses make it really hard to know.
Hijacking this top comment to say that I found the AI video creator: https://www.instagram.com/bugugugugu_aigc/
I agree. Apart from the text appearing backwards it all looked pretty real to me.
My assumption was the uploader wanted to make the creator's "AIGC" less obvious. It definitely did that to me.
Yeah, that's a weird one. I doubt the video was generated that way. I assume someone flipped the video for "artistic" purposes.
Reversing text is a known loophole for getting around copyright guardrails in image-generation models.
How does that work? Would you prompt the model to write "hello Kitty but in reverse" on the train so the resulting image isn't flagged?
Much more likely they just flipped the video in an editor after it was generated. It's common enough to see flipped video with backwards text on social media that most people wouldn't give it a second thought.
I'm beginning to write off most images as AI. I actually think that's where this is all headed.
There are projects like https://contentcredentials.org/. If we want, with some effort we could distinguish between real and AI-generated content. If.
No individual actor - human or corporate - stands to benefit enough because "trust in reality" is neither easily measured nor financialized.
Some do care, e.g. some camera manufacturers or some news agencies. Surprisingly, some social media platforms[1] want clear labels for AI-generated content.

[1]: e.g. tiktok https://newsroom.tiktok.com/en-us/partnering-with-our-indust...

That's the easiest position imo. It's AI unless proven otherwise. No one has the time to pay this much attention to detail on a random video when the purpose of the video is just entertainment. What this might lead to, though, is people losing (or not learning) the skills needed to separate real content from AI-generated content.
And even if it isn't AI, it is quite possibly deceptively edited. Content provenance will be important in the future.
A precondition is likely that one has mainly watched CGI-heavy movies for most of one's life. Compared to old-school analog movies or fairly raw photography, that looks as fake as the Coca-Cola Santa. There's a rather obvious lack of detail that real photography would have caught.
> A precondition is likely that one has mainly watched CGI-heavy movies for most of one's life.

Indeed, a great (if counterintuitive) example of this is The Wolf of Wall Street. I bet a lot of people would be surprised at just how much CGI is used in that just for set/location.

The OG film for that was Forrest Gump. It is often lauded as one of the first movies to use CGI heavily but in completely, and intentionally, unnoticeable ways...
True, but in that case you knew it had to be CGI because Kennedy didn't talk to Tom Hanks in any capacity.
Sure, it's like a weird dream where sometimes shadows don't come from the sun and the scenery has this absurd, acutely unreal polish.
A) It's also true that many people don't put a lot of thought into very much at all. They'd never consider actively thinking about whether a video is fake or not. These are the targets of short form content.
B is / will be huge; the largest amount of "mindless" content is consumed on phones, with half attention, often with other distractions going on and in between doing other stuff, and can be watched on older / lower fidelity devices, slower internet connections, etc. AI content needs high resolution / big screens and focused attention to "discover".

The truth is... most people will simply not care. Raised eyebrow, hm, cute, next. Critical watching is reserved for critics like the crowd on HN and the like, but they represent only a small percentage of the target audience and revenue stream.

You can see the perspective/angle of the objects changing slightly as the camera moves in a way that makes it pretty obvious they're CG, AI or otherwise. That's always been a problem with AI generated imagery in video/animation; it changes too much frame to frame. If researchers figure out how to address that, yeah, we've got a problem. Until then - this looks worse tha

Then there are the usual giveaways for CG - sharpness, noise, lighting, color temperature, saturation - none of them match. There's also no diffuse reflection of the intense pink color.

Yes. The lack of diffuse reflection from the pink train is the clearest giveaway, and AI videos in general have problems with getting shadows and radiosity right. There's also the existence of the real-world Hello Kitty Shinkansen and the APM Cat Bus in Japan that makes this image more plausible.
That last point is also important; if it's not surprising, people will just accept it without being too critical about it. And since these AI tools are trained with real / existing content, creating realistic-enough content will be the norm. I think the first big AI generators - DALL-E and co - were trained primarily on more fantastical / artistic sources, also because realistic generation (like humans) wasn't yet good enough, or was too uncanny. But uncanny and art work well together.
Also consider that one of the reasons AI-generated video has CG-like artifacts is that it is trained on CG video. Better CG generation and more real video for training will reduce these over time.
Honestly, stuff like that could also be because of compression. We're all used to seeing low quality videos online.