| HN Mirror


by conjectures 3529 days ago

Thanks for the response Cornelia.

I can see that the Computer Vision API does return some useful information. E.g. it appears to discriminate well between abstract images and photos. I appreciate the inclusion of scores with the returned information.

However, the captioning reliably produces odd results. I Googled "Italian guy eating pizza" to fit the "person verbing a common noun" pattern. This was the first non-cartoon image for me:

https://s-media-cache-ak0.pinimg.com/564x/68/c6/cf/68c6cf87b...

And the caption:

{ "type": 0, "captions": [ { "text": "a man and a woman eating a plate of food", "confidence": 0.44831967045071774 } ] }

The woman in question is, I presume, the small statue of the Virgin Mary stood next to the pizza.
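Since the response includes a score, one way to use it is to flag captions that fall below some cutoff. This is only a rough sketch: the JSON is the response quoted above, while the 0.5 threshold and the helper name `best_caption` are assumptions made for illustration, not part of the Computer Vision API.

```python
import json

# Response body copied from the caption result quoted above.
response_text = (
    '{ "type": 0, "captions": [ { "text": "a man and a woman eating '
    'a plate of food", "confidence": 0.44831967045071774 } ] }'
)

# Hypothetical cutoff, chosen only for illustration.
MIN_CONFIDENCE = 0.5


def best_caption(body, min_confidence=MIN_CONFIDENCE):
    """Return (text, confidence, trusted) for the top-scoring caption, or None."""
    captions = json.loads(body).get("captions", [])
    if not captions:
        return None
    # Pick the caption with the highest confidence score.
    top = max(captions, key=lambda c: c["confidence"])
    trusted = top["confidence"] >= min_confidence
    return top["text"], top["confidence"], trusted


result = best_caption(response_text)
if result is not None:
    text, confidence, trusted = result
    label = "ok" if trusted else "low confidence"
    print(f"{text!r} ({confidence:.2f}, {label})")
```

With that assumed cutoff, the pizza caption at roughly 0.45 would be flagged as low confidence rather than taken at face value.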

There were also a few things I thought would fail but didn't, e.g. distinguishing preparing food from eating it. This was nice.