Hacker News
by thekyle 2110 days ago
Would this have any advantage over just using video embeddings (or a sequence of frame embeddings), which in theory should capture those things in vectorized form?