

	by snek_case 1489 days ago
	It's hard to say because even a model like GPT-3 is limited in its ability to generate a textual story that remains coherent over time. When you're talking about generating video, you need to have lots of story and visual details remaining coherent over a very long time horizon. Generally speaking, I think this is an area where deep neural networks are fairly weak and symbolic AI shines. It's much easier to program a symbolic AI that generates a story that remains coherent over time. Though you might argue that the story probably wouldn't be very interesting. There's probably something to be done with a hybrid model that uses symbolic AI to enforce coherency constraints, and a deep network model that fills in details and generates visuals. So yeah, I think we'd need new, much more sophisticated architectures. We'd also need a lot more compute, like 10x, 100x or maybe even 1000x more, to generate high-resolution video. Actually, the problem is probably not the amount of compute you need for inference, but the amount of compute you'd need to train a model with hundreds of billions of parameters or however much is needed to make that happen.

3 comments

thorum 1489 days ago

I suspect the solution to keeping a long story coherent is using the model at different levels of abstraction. A human writer doesn’t sit down and write a complete novel in one sitting. They go through a process of planning, character development, world building and so on. When they write a scene, they’re not holding every detail about the rest of the book in their mind, they’re narrowing down to the details that matter for that scene.

So instead of asking the AI to write a novel in one go, why not guide it through a similar process? At each step, pass in information from previous steps as context, focusing on just the details it needs at that step. Have it generate a summary, then a setting, then characters in that setting, then break the plot into chapters, and then scenes, and so on…
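
A rough sketch of what that step-by-step process could look like, purely as an illustration. Everything here is an assumption: `generate` stands in for whatever model call you have, and the step names, prompts, and `fake_generate` stub are made up for the example.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a language-model call: prompt in, text out.
Generate = Callable[[str], str]

# Planning steps, from most abstract to most concrete (illustrative names).
STEPS = ["summary", "setting", "characters", "chapter outline"]


def plan_story(premise: str, generate: Generate) -> Dict[str, str]:
    """Build the plan one step at a time, feeding earlier steps in as context."""
    notes = {"premise": premise}
    for step in STEPS:
        context = "\n".join(f"{name}: {text}" for name, text in notes.items())
        notes[step] = generate(f"{context}\n\nWrite the {step}.")
    return notes


def write_scenes(notes: Dict[str, str], scene_titles: List[str],
                 generate: Generate) -> List[str]:
    """Write each scene with only the details it needs, not the whole plan."""
    scenes: List[str] = []
    for title in scene_titles:
        context = [notes["summary"], notes["characters"]]
        if scenes:
            # Carry over just the previous scene for local continuity.
            context.append("Previous scene: " + scenes[-1])
        prompt = "\n".join(context) + f"\n\nWrite the scene: {title}"
        scenes.append(generate(prompt))
    return scenes


def fake_generate(prompt: str) -> str:
    """Placeholder model: echoes the instruction on the prompt's last line."""
    return f"<text for: {prompt.splitlines()[-1]}>"


if __name__ == "__main__":
    notes = plan_story("a heist goes wrong", fake_generate)
    for scene in write_scenes(notes, ["Opening", "Twist"], fake_generate):
        print(scene)
```

Swapping `fake_generate` for a real model call is the only change needed to try it. Which details each scene gets as context is the actual design question, and this sketch picks an arbitrary answer.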


Shocka1 1488 days ago

Yup, this makes a lot of sense to me. I could even see models broken down by director/film. So many combinations could be used - the possibilities are endless. A Tarantino model like that of Pulp Fiction might be a good one.


labster 1489 days ago

Having a story remain coherent over time is not a prerequisite of Hollywood blockbusters.


skocznymroczny 1489 days ago

Reminds me of that South Park episode in which Cartman disguises himself as a robot to prank Butters, but movie executives confuse him for an actual robot and make him think up movie ideas.

"Adam Sandler is like, in love with some girl, but then it turns out that the girl is actually a Golden Retriever. Or something."


avereveard 1489 days ago

Incoherency is not all bad, and choosing the subject carefully can be enough to let the intrinsic weirdness of AI generation shine.

E.g. this Batman short: https://m.youtube.com/watch?v=fn4ArRmzHhQ (AI story and human drawing) provides a spot-on Joker.


stavros 1489 days ago

Those "AI writes" are more a meme than actual AI. GPT-3 writes perfectly grammatical sentences but the stories don't make much long-term sense, which is the opposite of what the story in the video has.

These videos are mostly "human writes funny, says it's AI".


taftster 1489 days ago

Right. The Transformers comes to mind.


outworlder 1489 days ago

> It's hard to say because even a model like GPT-3 is limited in its ability to generate a textual story that remains coherent over time.

Just like most dreams. Can still be entertaining.


ravi-delia 1489 days ago

In a limited capacity. Any dream I'm even a little aware of becomes extremely boring, frustrating, and claustrophobic. Even the good ones become nightmares without anything changing. That might just be me though; maybe GPT could cook up dreams that don't suck.


marc_io 1489 days ago

Oh, but dreams, even the wildest ones, are coherent, especially over time. Their coherence is just not usually obvious to the ego mind.
