Show HN: Hotshot – 4 Person Team Builds a State of the Art Video Model (hotshot.co)
26 points by kcaluru 675 days ago
Hi HN! We're proud to share Hotshot, a large-scale diffusion transformer model for text-to-video generation that we built with just a 4-person team. You can try it today in beta at https://hotshot.co, with 2 free generations per day.

The model generates 5 seconds of 720p video from text prompts. It excels at prompt alignment and consistency, and at generating people, animals, and nature.

In blind tests with 100 users, Hotshot generations were preferred to Runway ML 60% of the time. Hotshot generations were preferred to Luma 80% of the time. Overall, users preferred Hotshot's results to other publicly available text-to-video models ~70% of the time.

We built this model from scratch with a 4-person team in just 4 months. It is trained on 600 million video clips and 1 billion images. It uses a custom-trained video captioner for better temporal understanding and a custom autoencoder for efficient long-sequence training.

We've detailed more technical aspects of the journey in a blog post: https://hotshot.co/release

Some technical highlights: (A) scaling to thousands of GPUs and tackling the infrastructure and optimization challenges that come with it; (B) developing custom kernels and data parallelism techniques; (C) creating a watchdog system to detect and respond to GPU process hangs; (D) optimizing data streaming and compression for efficient training.
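
For readers curious what a hang watchdog can look like, here is a minimal heartbeat-based sketch in Python. It illustrates the watchdog item in the list above under stated assumptions and is not the system described in this post: the class name `HeartbeatWatchdog`, the 30-second timeout, the worker names, and the `on_hang` callback are all hypothetical. Each training process reports a heartbeat after a step, and the watchdog flags any process whose last heartbeat is older than the timeout.

```python
import time
from typing import Callable, Dict, List


class HeartbeatWatchdog:
    """Hypothetical sketch: track the last heartbeat per worker and flag quiet ones."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_beat: Dict[str, float] = {}

    def beat(self, worker_id: str) -> None:
        # Called by a training process each time it finishes a step.
        self.last_beat[worker_id] = self.clock()

    def stalled(self) -> List[str]:
        # A worker counts as hung if its last heartbeat is older than timeout_s.
        now = self.clock()
        return sorted(w for w, t in self.last_beat.items() if now - t > self.timeout_s)

    def check(self, on_hang: Callable[[str], None]) -> List[str]:
        # Run the recovery callback (for example, kill and restart) per hung worker.
        hung = self.stalled()
        for worker_id in hung:
            on_hang(worker_id)
        return hung


if __name__ == "__main__":
    fake_now = [0.0]  # simulated clock, in seconds
    wd = HeartbeatWatchdog(timeout_s=30.0, clock=lambda: fake_now[0])
    wd.beat("rank0")
    wd.beat("rank1")
    fake_now[0] = 20.0
    wd.beat("rank0")  # rank0 keeps making progress, rank1 goes quiet
    fake_now[0] = 45.0
    wd.check(on_hang=lambda w: print("would restart", w))
```

In a real cluster the heartbeat would likely travel over a file, socket, or key-value store rather than an in-process dictionary, and the recovery action would depend on the job scheduler in use.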

We believe that this model is just the beginning. In the next 12 months, we expect creators to produce entire YouTube videos with AI. Text-to-video models like this one lay the foundation for this and much more. Control over every aspect of generations, longer durations, higher resolutions, real-time interactivity, and more modalities (like audio!) are just around the corner.

We're here to answer any questions about the model, our training process, or our plans for the future. We're also always looking for talented individuals to join our team!

We'd love for you to try our 2 free generations per day and let us know what you think. We're excited to see what the HN community will create with it!

6 comments

The output quality is the best so far

https://optimus.hotshot.co/shot/PFhq

Wait why does the title say you built “Sora”? Isn’t that the OpenAI project?
This is mindblowing, both in terms of quality and how quickly it was built with lean resources. Congrats on the launch!
Great to see this launch -- excited to see what you guys do!
i’ve tried a bunch of the models and this one is unbelievably good! congrats - can’t believe this was done by a 4 person team
Kudos on this launch. Much awaited!!