by mkagenius 543 days ago
I am working on open-source phone automation: https://github.com/BandarLabs/clickclickclick

I haven't quite been able to get the Llama vision models working, but I suppose that with future releases they should work as well as Gemini at finding the bounding boxes of UI elements.