Hacker News
by AureliusDreamer 1492 days ago
I would love to hear what kind of model this is using. Some type of transformer?
1 comment

Yes, it's a "zero-shot" deep learning model. It uses both the image and the text from the page to make predictions.