by mkagenius 522 days ago
Yes, I tried a similar one called "omniparser". The issue was that it sometimes failed to annotate some UI elements. Moreover, Gemini and Molmo worked right out of the box without needing any fine-tuning.