| HN Mirror



	by makomk 852 days ago
	As I understand it, diffusion-based video generation models simply are not causal in this way. They modify earlier frames to be consistent with later frames just as much as they modify later frames to be consistent with earlier ones. That's why Yann LeCun can argue that they do not have to be able to generate plausible continuations of a real video, only some arbitrary sample from the space of plausible-looking videos, and that the latter does not imply the ability to do the former. It's also why they can't simply generate videos of arbitrary length, and why lots of VRAM is required to create even a relatively short clip.
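
	To make the distinction concrete, here is a toy sketch in Python. It is not how any real video model works: `joint_denoise` and `causal_generate` are hypothetical stand-ins, each "frame" is a single number, and the "denoiser" is just neighbour averaging. The only thing it is meant to show is that when all frames are refined together, changing the noise for the last frame changes the first frame too, whereas a causal generator's first frame is unaffected.

```python
import random

def joint_denoise(noise, steps=10):
    """Toy non-causal refinement (assumption: stands in for a diffusion-style
    denoiser). Every step updates ALL frames at once, mixing each frame with
    both its previous and its next neighbour."""
    frames = list(noise)
    n = len(frames)
    for _ in range(steps):
        frames = [
            (frames[max(i - 1, 0)] + frames[i] + frames[min(i + 1, n - 1)]) / 3.0
            for i in range(n)
        ]
    return frames

def causal_generate(noise):
    """Toy causal generator: frame t depends only on frames before t
    and on its own noise value."""
    frames = []
    for z in noise:
        prev = frames[-1] if frames else 0.0
        frames.append(0.5 * prev + 0.5 * z)
    return frames

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]
# Change only the noise that seeds the LAST frame.
perturbed = noise[:-1] + [noise[-1] + 10.0]

joint_first_changed = joint_denoise(noise)[0] != joint_denoise(perturbed)[0]
causal_first_changed = causal_generate(noise)[0] != causal_generate(perturbed)[0]
print(joint_first_changed, causal_first_changed)  # True False
```

	Note that `joint_denoise` needs the whole clip in memory at every step, while `causal_generate` only needs the previous frame. In this toy picture, that is the same reason the clip length is fixed up front and memory grows with it.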

1 comment

fancyfredbot 852 days ago

Thank you. That would make sense. Perhaps Yann's target audience is supposed to know this already, but your explanation actually cleared up a misunderstanding for me.