by zuminator 563 days ago
That was also true of quite a lot of early CGI, but most people would say that things have improved. I think we're on the cusp of rapid improvement in AI video as well, in part spurred on by skilled people using the tools we currently have.

I came across the following recently. I think a casual viewer would assume it was just Bakshi-style rotoscoped animation without a major AI component.

https://www.youtube.com/watch?v=X9BG6yBkOIE

3 comments

My constructive criticism of this video is "90% of it is figures standing still while wind blows their outfit or the camera does a simple move," sometimes moving their lips as though talking. Though I did like that bird turning its head smoothly away like "forget this, I'mma preen! Peace out!" Haha

Very much no throughline of concepts from one shot to the next. You never see the same character twice. No dynamic foreground action, not even simple walking, except for one far-away character walking directly away from the camera, which means that their silhouette hardly changes.

This all comes from the current generation of video diffusion models, which basically just generate an image like they always have, except that with a hint of temporal coherence they expand it into a short shot with no types of movement except those seen a million times in their training set.

Getting generative models to reason better about motion, and to build mental world models of the 3D scene they are managing a 2D window into, is going to be a big challenge, and require some additional breakthroughs on a par with the original GPT and Stable Diffusion breakthroughs that currently act as a foundation for the majority of modern AI innovation.

> ... and require some additional breakthroughs on a par with the original GPT and Stable Diffusion breakthroughs ...

You say this like Stable Diffusion isn't a 2022 technology. And not early 2022, but quite late (August). ChatGPT is younger.

I mean, sure, we need more breakthroughs, but we've barely even seen a new hardware generation since those things came out, and the researchers are really only getting started with the new capabilities of generative tech. If we don't get more breakthroughs in short order, then that would be a stunning halt of progress, a braking stop the likes of which we have almost never before seen. More breakthroughs are a given.

An interesting example. It may be because I consider myself a fan of animation (more so than the average person), but the video has obvious garbage less than fifteen seconds in, with the spaceships (?) morphing and sludging around the pyramid.

Sure, that's why I said casual viewer and not careful viewer. But getting back to your original point, would you say it was foul and unpleasant? That's really what I'm claiming, that we're fairly quickly advancing beyond the old days of those nightmare Nekobuses and vomit-inducing clips of Will Smith devouring spaghetti, and into territory where at least some people can find the product genuinely enjoyable. Of course nothing will ever be perfect. AI aside, after all these years it's still often jarring when computer physics is shoehorned into cartoons/anime designed to look like traditional hand-drawn animation.

> would you say it was foul and unpleasant?

If I were to watch 90+ minutes of that with dubbed voices on top of it, absolutely. There's practically zero cohesion between any of those shots. No real action, no real narrative. It's a collection of non-cohesive stills that were stretched, not any bit of a story at all.

Not sure why you're downvoted. This is one of the most objectively true things said here. CGI was pretty crappy for at least the first few decades of its existence. Even aspects of the animation in Toy Story really show that film's age. I remember realizing that in the early 2000s. Most people either forgot or didn't even experience the early days of CGI and would consider much of it to be nightmare fuel today.

AI is pretty clearly advancing orders of magnitude faster than CGI has. Just because it sucks now doesn't mean it's going to suck in another 5 years.

"Just because it sucks now doesn't mean it's going to suck in another 5 years."

We will see. Some flaws might be baked in, like LLMs hallucinating. That won't go away unless we invent a new tech. So here, with generating videos, will morphing objects, for example, ever go away? I am sceptical of the current approach.