Hacker News new | ask | show | jobs
by cr4zy 525 days ago
Wow, this is awesome! Thanks for building. I didn't realize there was a protocol for streaming while rendering, though I noticed sumo.ai doing something similar for audio. Gemini with grounding is new to me also, very nice!
1 comments

thanks! Streaming was actually pretty hard to get working, but it goes roughly like this as a streaming pipeline:

- The LLM is prompted to generate an explainer video as sequence of small Manim scene segments with corresponding voiceovers

- LLM streams response token-by-token as Server-Sent-Events

- Whenever a complete Manim segment is finished, send it to Modal to start rendering

- Start streaming the rendered partial video files from manim as they are generated via HLS