Hacker News
by docheinestages 6 days ago
This is a helpful method for visually grounding LLMs to take actions on the screen such as clicking. For humans though, hell no.