Hacker News
by
simonw
276 days ago
Yeah, I've been disappointed in GPT-5 for OCR - Gemini 2.5 is much better on that front:
https://simonwillison.net/2025/Aug/29/the-perils-of-vibe-cod...
IanCal
276 days ago
For images in general, nothing comes close to Gemini 2.5 for understanding scene composition. The models perform segmentation, so you can even ask for things like masks or bounding boxes of arbitrary objects.
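For anyone who wants to try the bounding box part, here is a rough sketch of handling the output. It assumes you prompted the model to return JSON with `box_2d` entries in [ymin, xmin, ymax, xmax] order scaled to 0-1000, which is the convention Google's image understanding docs describe. The helper name `to_pixel_boxes` and the sample response are made up for illustration, and no API call is shown.

```python
import json


def to_pixel_boxes(response_text, width, height):
    """Convert boxes from the assumed [ymin, xmin, ymax, xmax] 0-1000 form
    into pixel (left, top, right, bottom) tuples for an image of the given size."""
    boxes = []
    for item in json.loads(response_text):
        ymin, xmin, ymax, xmax = item["box_2d"]
        boxes.append({
            "label": item.get("label", ""),
            "box": (
                round(xmin / 1000 * width),
                round(ymin / 1000 * height),
                round(xmax / 1000 * width),
                round(ymax / 1000 * height),
            ),
        })
    return boxes


# Hypothetical model output, not a real response.
sample = '[{"box_2d": [100, 250, 500, 750], "label": "cat"}]'
print(to_pixel_boxes(sample, width=640, height=480))
```

The resulting tuples are in the (left, top, right, bottom) order that most drawing libraries expect, so they can be passed to a cropping or rectangle-drawing call.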