by LuisMondragon 1727 days ago
Not the same. CLIP is trained with pairs of images and texts, whereas VideoCLIP uses pairs of videos and texts.