| You've definitely hit on something here.
> That said, I don't think having AI write the scripts from scratch is the right way to go here. The dialogue for the first episode still smells of RLHF, with characters being far too complimentary to each other and having bizarre verbal ticks. And is it needed? The world is full of people with smart stories who want to tell them, but we're in an era when reading is in decline.
I'm not sure that it's right to say that the scripts are written "from scratch" -- the "Bible" for the series is hand-written. From Part 2 of the blog post:
> Episode generation is autonomous, but the show bible is human-made. The prompts and code that control the LLM are human-made, too. Each episode’s output is closely reviewed by humans. Because models often change, and each new episode tends to reveal bugs/weaknesses in the system, prompts get tweaked by humans, too. This is less and less necessary as more episodes are produced.
If the hierarchy goes Bible (series) -> Synopsis (Episode summary) -> Script (scene details), then the author is hand-writing #1, and you're suggesting humans hand-writing #3.
> So the most interesting part of this is all the tooling that comes after that point: the rug smoothing, the AI-generated voice acting and especially the game engine based renderer that can generate videos given simple instructions. The blog posts sort of glide over that part, I guess due to the author's background in game engine development, but it seems the most useful part actually.
The visualizer / generator certainly is the most novel and useful part of this. I had the same struggles / hangups with the overly-complimentary dialogue in E1 as you did, and it smells much of GPT-4. That said, I agree with the author -- this feels like the first "self-hosting" version of this entire pipeline.
Steve Newcomb wrote an article on the idea of taking the lessons learned from CI/CD pipelines and applying them to movie development: https://stevenewcomb.substack.com/p/a-whole-new-way-to-creat... Now that the OnScreen system is "self-hosting" (maybe not the right analogous word) and producing the entire movie when clicking "build", it's possible to hand-tune things as needed to realize a vision -- with whatever level of detail and abstraction that the author would want -- whether it's at the "Bible" level, or on a more detailed note. |
I am planning on doing some more articles/director commentary as it goes along.
I have a number of episodes in the queue and each one is better than the last. My plan is to release an entire season of 12 or so.
The "I'm a GPT that wants everyone to be friends, and how" tone gets steadily better in those episodes.
Even incremental improvements in stuff like background music make a big big difference.
I really want to do a v2 that is more of a "copilot" than an "AI first" experience. But I need partners to help with funding; I've taken it about as far as I can on a solo basis. The next step is a team of 4-5 people levelling it up. Every piece could be 10x better, and it would be a different beast entirely if that happened. I think there are some super exciting directions this could go.
The vision of a distributed creator system is very interesting, as is letting people do more hands-on writing/rewriting.
If any VCs are reading, I'd love to talk. :)
(PS - Hi Han!)