by
marmadukester39
1254 days ago
Is it? Videos are just sequences of frames
1 comment
rdedev
1254 days ago
Each frame of the video would have to be divided into many patches, and every patch becomes an element of the input sequence. At least that's how transformer-based image models work. Then you have to account for the audio data in the same way too. It just blows up the compute required.
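To put rough numbers on that, here is a back-of-the-envelope sketch. The figures are assumptions chosen for illustration, not taken from any particular model: 224x224 frames, 16x16 patches (a common ViT-style setup), 30 fps, a 10-second clip, and full self-attention whose cost grows with the square of the sequence length. Audio is left out, so the real total would be higher.

```python
# Rough token-count arithmetic for feeding video to a transformer.
# All sizes below are illustrative assumptions, not from a specific model.

def patches_per_frame(height: int, width: int, patch: int) -> int:
    """Number of non-overlapping patch x patch tiles in one frame."""
    return (height // patch) * (width // patch)

def video_tokens(height: int, width: int, patch: int, fps: int, seconds: int) -> int:
    """Total patch tokens if every frame of the clip is tokenized."""
    return patches_per_frame(height, width, patch) * fps * seconds

def attention_pairs(n_tokens: int) -> int:
    """Token-to-token pairs in full self-attention (quadratic in length)."""
    return n_tokens * n_tokens

frame_tokens = patches_per_frame(224, 224, 16)
clip_tokens = video_tokens(224, 224, 16, fps=30, seconds=10)
ratio = attention_pairs(clip_tokens) // attention_pairs(frame_tokens)

print(f"tokens per frame: {frame_tokens}")
print(f"tokens per 10 s clip: {clip_tokens}")
print(f"attention cost vs. a single image: {ratio}x")
```

Under these assumptions a single image is 196 tokens, the clip is 58,800 tokens, and naive full attention over the clip costs about 90,000 times what it costs for one image. Real video models usually cut this down with frame subsampling or factorized attention, but the scaling is the point.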