by subho406 2529 days ago
The model was trained on video classification, image QA, and image captioning. It was never trained on video captioning or video QA, yet it still shows results on those tasks.