|
|
|
|
|
by snek_case
1489 days ago
|
|
It's hard to say because even a model like GPT-3 is limited in its ability to generate a textual story that remains coherent over time. When you're talking about generating video, you need to have lots of story and visual details remaining coherent over a very long time horizon. Generally speaking, I think this is an area where deep neural networks are fairly weak and symbolic AI shines. It's much easier to program a symbolic AI that generates a story that remains coherent over time. Though you might argue that the story probably wouldn't be very interesting. There's probably something to be done with a hybrid model that uses symbolic AI to enforce coherency constraints, and a deep network model that fills in details and generates visuals. So yeah, I think we'd need new, much more sophisticated architectures. We'd also need a lot more compute, like 10x, 100x or maybe even 1000x more, to generate high-resolution video. Actually, the problem is probably not the amount of compute you need for inference, but the amount of compute you'd need to train a model with hundreds of billions of parameters or however much is needed to make that happen. |
|
So instead of asking the AI to write a novel in one go, why not guide it through a similar process? At each step, pass in information from previous steps as context, focusing on just the details it needs at that step. Have it generate a summary, then a setting, then characters in that setting, then break the plot into chapters, and then scenes, and so on…