Hacker News new | ask | show | jobs
by GaggiX 370 days ago
Very nice being able to read an actual paper on a powerful text-to-video model.
1 comments

Yeah. It is great. So apparently separating spatial / temporal attention works if you are careful and train with large enough dataset too!