|
|
|
|
|
by throwaway4233
851 days ago
|
|
While Sora might be able to generate short 60-90 second videos, how well it would scale with a larger prompt or a longer video remains yet to be seen.
And the general logic of having the model do 90% of the work for you and then you edit what is required might be harder with videos. |
|
Most fictional long-form video (whether live-action movies or cartoons, etc) is composed of many shots, most of them much shorter than 7 seconds, let alone 60.
I think the main factor that will be key to generate a whole movie is being able to pass some reference images of the characters/places/objects so they remain congruent between two generations.
You could already write a whole book in GPT-3 from running a series of one-short-chapter-at-a-time generations and passing the summary/outline of what's happened so far. (I know I did, in a time that feels like ages ago but was just early last year)
Why would this be different?