Hacker News
by lynx23 817 days ago
Well, as a blind user, I'd like to point at the OpenAI Vision integration of BeMyEyes! Being able to get fully detailed scene descriptions including OCR and translation all in one package was pretty much a game changer for me.

Not so much kick-ass, but still works nicely: https://github.com/mlang/tracktales -- My MPD track announcer with support for describing album art...

2 comments

This is an example of great AI UX, and I'm not sure why it's not upvoted more. It's rare for users to use the term "game changer" when it comes to UX.
Can I ask if you think this might make alt text on images obsolete? Do you use the alt text where it’s available, or BeMyEyes (I presume you have a choice)?
Those are two different topics. BeMyEyes is a smartphone app which brings your camera and OpenAI vision models together. It is meant to be used to describe/OCR things in your real-world environment.

While alt texts could theoretically be replaced by a browser/screen reader functionality that asks a vision model to describe the image, it is a waste of time and energy to have each and every user do it over and over again.

Ah, sorry, got you. The aspect I think about with alt text is that AI is often better than a mediocre human effort, and it is improving all the time. Improving AI will improve the description of all images, even older ones, and therefore you might want to run the current AI on all images, even if they have existing alt text.