by eeeficus 857 days ago
> Sora is like 80% there, I think we would be 90% there in 2 years time with 5-10x more GPGPU power training.

Don’t underestimate the last 20%! I don’t think this is just a matter of more training or more data. But yeah, 2-5 more years will not change the outcome you’re describing!

Speaking as someone who has done VFX professionally, I think Sora is closer to 30% there. It's really not constituted in a way that makes it easy to combine with existing techniques. You either need to come up with a way to make them play nicely together or you have to recreate the entire production pipeline in the model.

It'll get better, but I don't think they've been aiming in the right direction if the goal is to replace current production methods.

Yeah, text prompting alone won't replace production work, but it can surely accelerate and boost pre-prod work: maybe doing prelim shots for a storyboard, creative ideas for the costume dept... lots of cool things.

I think for prod work it needs what the auto1111 plugins have given SD - ControlNet etc, where you have a lot more control over the generation and the text prompt is just a small part of it.

Yeah, I don't think people understand how much pre-prod work is done. For the next 5 to 10 years this is what it'll be used for. In the next 20 years, 90% will be out of work.
The problem is that Hollywood doesn't really care. They live on making compromises during shoots, and if they can save hundreds of thousands as well as dozens of hours and manpower on a scene, they will do it (and they do all the time with horrible CGI, poorly shot scenes, etc...), and they aren't the only industry that will do this.

Right away, the stock footage market is in danger because content creators will opt for a free (good enough) option over having to not just pay for, but also hunt down, exactly what they want and need.

Plenty of other industries will be affected. I can't say that 80% of people will lose their jobs; these things aren't a switch you flip, it's a gradual process. But over time we'll definitely see jobs made obsolete or optimized away.

This technology is the worst it'll ever be, and we've seen how far it got in just a single year with Will Smith eating spaghetti and now Sora.

> The problem is that Hollywood doesn't really care. They live on making compromises during shoots, and if they can save hundreds of thousands as well as dozens of hours and manpower on a scene, they will do it (and they do all the time with horrible CGI, poorly shot scenes, etc...), and they aren't the only industry that will do this.

Isn't this a problem for Hollywood itself? Isn't the strength of Hollywood the capital to hire good actors and effects people etc?

You seem to be assuming that Hollywood will just fire all the actors and save money, but to me it seems more like Hollywood is in the danger zone too?

The thing is, you run the same prompt twice and you get two unique results. It's trivial to keep asking for more renditions of a given scene until you are happy with it; even kids can do this, and certainly somebody in a sweatshop in Vietnam can do it too. You literally need one person and a bit of computing power. So instead of a team at Pixar or Disney you have one guy with the product and some prepaid AWS compute.

Now, I don't know how long it took OpenAI to put together that 10-minute demo video, or whether they ran 3x the resulting prompts or 100x and had to cherry-pick hard (e.g. that bird's head: I was expecting heavy morphing of those feathers, but they kept their visual consistency; surely that at-a-glance-perfect result was not the first attempt).

sama sat on Twitter that day and rendered prompts from people on the fly, so those weren't heavily cherry-picked. I didn't watch most of them and they had some bugs for sure, but if they had had to cherry-pick like 1 of 100 for the blog demos he would never have sat and done live renders like that.
Just imagine what it could do with a storyboard