Hacker News
by pesenti 4145 days ago
The top 3 classes in your example are actually correct - it is a color photo of a human. But we expect it to get much better over time. Only real-world usage will allow us to make real improvements - and that's why we are eager to release early.

We also believe that the first applications (e.g., classifying animals or plants or landmarks in dedicated apps) will have narrower use cases that give better accuracy.

2 comments

The top 3 may be correct, but they aren't very useful. What could I do with this information? What feature could I build?

Also, the other results are very wrong (e.g., Watson is more confident that this is a dog than a person, and I have no idea where it got "Long Jump" from). This makes it hard for me to trust Watson.

Is the recommendation that I incorporate a "confidence in Watson" metric, and ignore most of the results?

What confidence from Watson would you say indicates an answer that is probably accurate? And how confident are you that Watson's self-reported confidence is accurate?
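Both questions can be explored empirically. A rough sketch, assuming the API returns a label-to-score mapping: keep only labels above a cutoff, then check on hand-labeled photos how often predictions above that cutoff were actually right. The function names, the scores, and the 0.7 cutoff are all made up for illustration; none of this is real Watson output or a recommended value.

```python
def filter_labels(scores, threshold=0.7):
    """Keep labels whose self-reported score clears the cutoff, highest first."""
    kept = [(label, s) for label, s in scores.items() if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


def precision_above(samples, threshold):
    """Fraction of predictions scored >= threshold that were actually correct.

    `samples` is a list of (score, was_correct) pairs from photos a human
    has already labeled. Returns None if nothing clears the threshold.
    """
    hits = [correct for score, correct in samples if score >= threshold]
    if not hits:
        return None
    return sum(hits) / len(hits)


# Hypothetical scores for a single photo (invented for this example).
example_scores = {"Color": 0.95, "Human": 0.80, "Photo": 0.75,
                  "Dog": 0.60, "Long Jump": 0.55}
confident = filter_labels(example_scores)

# Hypothetical hand-checked predictions: (score, was the label right?).
checked = [(0.9, True), (0.8, True), (0.75, False), (0.6, False), (0.55, True)]
observed = precision_above(checked, 0.7)
```

If the observed precision at a cutoff is far below the scores themselves, the self-reported confidence is not well calibrated and the cutoff should be chosen from measured precision rather than from the raw numbers.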

I tend to disagree. Assuming they are correct on a larger corpus, you can start doing things like "only do face matching on pictures with people in them" and weed out photos in a batch that don't have those three properties.

Watson is a training API rather than, say, the more fanciful emergent-AI type of API. The more data it gets, the better it gets. Similarly, Google's voice recognition isn't good because someone coded the magic constants for various accents; it is good because Google fed it millions of samples of spoken words and corrected it when it got them wrong.

Thanks for your comment. This makes sense - I would use Watson to determine which photos have humans at all, and then run those through, e.g., my facial recognition software. But Watson would keep me from having to waste resources looking for faces in photos of trees, for example.

I'm not in this field, so I'm having trouble understanding what use cases / consumer facing features this API unlocks. Your comment is very helpful in that regard.
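The pre-filtering idea described above could look roughly like this. It is only a sketch: `classify` stands in for whatever call returns label scores for a photo (a fake is used here), and the "Human" label name, the file names, and the 0.7 cutoff are assumptions, not the actual Watson interface.

```python
def photos_with_people(photos, classify, label="Human", threshold=0.7):
    """Return only the photos whose score for `label` clears the cutoff.

    `classify` is any callable mapping a photo to a {label: score} dict.
    """
    selected = []
    for photo in photos:
        scores = classify(photo)
        if scores.get(label, 0.0) >= threshold:
            selected.append(photo)
    return selected


def fake_classify(photo):
    """Stand-in classifier with canned, invented scores."""
    canned = {
        "portrait.jpg": {"Human": 0.90, "Color": 0.95},
        "tree.jpg": {"Tree": 0.85, "Color": 0.90},
        "goat.jpg": {"Animal": 0.80, "Human": 0.20},
    }
    return canned[photo]


batch = ["portrait.jpg", "tree.jpg", "goat.jpg"]
# Only these would be passed on to the more expensive face matcher.
to_match = photos_with_people(batch, fake_classify)
```

The cheap classifier runs on every photo and the expensive face matching runs only on what survives, so a lower cutoff trades wasted face-matching work for fewer missed people.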

It's actually very useful if it can detect with reasonable confidence that there is a person in a picture.

One example use: at Kiva we require borrowers to have a picture of themselves posted for their loan. But sometimes we get pictures of things like goats or cows instead (those are kind of nice too, but gotta follow policy). Currently this is something we have to manually review for, but if we could automate that review piece it would save a lot of time (especially if at some point it could also count the number of humans in a photo).

Check out Clarifai. They have an image recognition API. It might be able to help detect people in the photo.

Why is the confidence that it is a human higher than the confidence that it is a placental mammal, and the confidence that it is a placental mammal higher than the confidence that it is an animal? More specific descriptions must have lower confidence.

Or is Watson not confident that humans are placental mammals and that placental mammals are animals?
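If the scores are read as probabilities and every human is a placental mammal and every placental mammal is an animal, then P(human) <= P(placental mammal) <= P(animal) has to hold. A small sketch of a check for that, with invented scores and a hypothetical function name (it says nothing about how Watson actually computes or defines its scores):

```python
def hierarchy_violations(scores, chain):
    """List (specific, general) pairs where the narrower class outscores the broader one.

    `chain` is ordered from most specific to most general.
    """
    violations = []
    for specific, general in zip(chain, chain[1:]):
        if scores[specific] > scores[general]:
            violations.append((specific, general))
    return violations


# Invented scores reproducing the pattern described above.
example_scores = {"Human": 0.80, "Placental Mammal": 0.70, "Animal": 0.65}
chain = ["Human", "Placental Mammal", "Animal"]
problems = hierarchy_violations(example_scores, chain)
```

Any pair returned here means the scores cannot all be probabilities of nested classes; they may instead be independent per-label scores that were never constrained to agree with each other.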