by wgd 427 days ago
How charitable of you to assume those examples work reliably.

The BeMyEyes app already works quite reliably to describe scenes to the blind
Haha, did you evaluate this personally?

I did a BeMyEyes test recently, trying to sort about 40 cans by whether they had a deposit logo. After 90 minutes of submitting photos, plus a second round to make sure it doesn't lie too much, I had 16 cans which according to BeMyEyes (OpenAI) had a deposit logo. Then I went to the shop to bring them back. Turns out, only 4 cans had a logo. So even after a second round to eliminate hallucinations, the success rate was only 25%.

Do you call that reliable?
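
For anyone checking the arithmetic: the 25% is the share of flagged cans that really had a logo (precision), not accuracy over all 40 cans. A minimal sketch using only the numbers from the comment above; the function name `precision` is just an illustrative helper, and nothing here says how many of the unflagged cans were misjudged, since those were not checked.

```python
def precision(true_positives: int, flagged: int) -> float:
    """Fraction of flagged items that were actually correct."""
    if flagged == 0:
        raise ValueError("no items were flagged")
    return true_positives / flagged


flagged_cans = 16    # cans the app said had a deposit logo
cans_with_logo = 4   # of those, cans that actually had one

print(f"{precision(cans_with_logo, flagged_cans):.0%}")  # 25%
```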

> I did a BeMyEyes test recently

But isn't the BeMyEyes assistance provided by other humans? I remember signing up for some "when blind people need your help" thing via BeMyEyes, and my understanding was that it's 100% humans on the other end of the call who help you.

Yes, what you are describing is how BeMyEyes started, and it still offers that feature.

However, about 1 or 2 years ago, they added a way to send in photos and have them described by an OpenAI vision model.

In general, it's a very nice feature when it works. For instance, I use it successfully to sort laundry.

But the deposit logo test I did gave horrible results...

That changed a while ago. They also use OpenAI's APIs now.

https://openai.com/index/be-my-eyes/

Are you willing to bet that it wouldn't work reliably in a year, 2 years, 5 years?
If you're releasing something today, should you talk about what it can do now or what it might be able to do in two years?