Hacker News
by kache_ 1342 days ago
Have you considered using ML/OCR to figure out the position of the text relative to the screen? Seems much simpler than relying on accessibility APIs.

Thank you for your hard work!

1 comment

I have plans to use ML/OCR to augment results down the road, but the AX APIs and ecosystem in most apps (that I encounter, at least) are generally decent. Also, relying on OCR means it won’t understand buttons with just icons, whereas AX APIs can grab ’em just fine.
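
A rough sketch of the icon-only-button point, using a made-up toy model rather than any real OCR engine or accessibility API. The `Element` class, its fields, and both helper functions are hypothetical names chosen for illustration; the assumption is simply that OCR can only recover text actually drawn on screen, while an accessibility tree can also expose a label for a control that shows no text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    """Hypothetical UI element; not a real AX or OCR type."""
    role: str
    visible_text: Optional[str]  # text drawn on screen, if any
    ax_label: Optional[str]      # label exposed to assistive tech, if any


def ocr_targets(elements: List[Element]) -> List[str]:
    # Stand-in for OCR: it can only find elements that render text.
    return [e.visible_text for e in elements if e.visible_text]


def ax_targets(elements: List[Element]) -> List[str]:
    # Stand-in for an accessibility query: prefer the exposed label,
    # fall back to visible text, skip elements that have neither.
    names = []
    for e in elements:
        name = e.ax_label or e.visible_text
        if name:
            names.append(name)
    return names


toolbar = [
    Element(role="button", visible_text="Save", ax_label="Save"),
    Element(role="button", visible_text=None, ax_label="Share"),  # icon only
]

print(ocr_targets(toolbar))
print(ax_targets(toolbar))
```

In this toy model the icon-only "Share" button is invisible to the OCR stand-in but still shows up through the accessibility stand-in, which is the gap described above.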

Thanks! It’s easily my longest-running project, at a decade.