by htrp 858 days ago
> We empirically find that training on videos at their native aspect ratios improves composition and framing. We compare Sora against a version of our model that crops all training videos to be square, which is common practice when training generative models. The model trained on square crops (left) sometimes generates videos where the subject is only partially in view. In comparison, videos from Sora (right) have improved framing.

Every CV preprocessing pipeline is in shambles now.
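
For anyone who hasn't stared at one of these pipelines, here's a rough sketch of the two options the quote contrasts: the usual center square crop versus a resize that keeps the native aspect ratio. This is only an illustration, not how Sora does it; the function names and the `max_side` parameter are made up for the example.

```python
def center_square_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


def native_aspect_resize(width, height, max_side):
    """Scale so the longer side equals max_side, keeping the aspect ratio."""
    scale = max_side / max(width, height)
    # Guard against a dimension rounding down to zero on extreme ratios.
    return (max(1, round(width * scale)), max(1, round(height * scale)))


def fraction_kept_by_square_crop(width, height):
    """Share of the original frame area that survives the square crop."""
    side = min(width, height)
    return (side * side) / (width * height)


if __name__ == "__main__":
    w, h = 1920, 1080  # a typical 16:9 frame
    print(center_square_crop_box(w, h))        # crop box in pixel coordinates
    print(native_aspect_resize(w, h, 512))     # resized (width, height)
    print(fraction_kept_by_square_crop(w, h))  # 0.5625
```

On a 16:9 frame the square crop throws away 43.75% of the pixels, all from the left and right edges, which is how a subject ends up only partially in view.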