Hacker News
by N0b8ez 718 days ago
The article mentions YouTube as a source of training data, but it seems to be talking only about audio transcriptions (text). Isn't YouTube more useful for multimodal training on the video data itself?