by eckr 3 days ago
In the past, they just ran DeepSeek OCR on your image to extract the text, then passed that text to a language-only model. I believe there is now a model that takes images as input directly.
1 comment

Talking about vision... I already had the vision tab there, hahahaha. I guess everything in tech these days is A/B...