by verusfossa 3729 days ago
I'm waiting for the day this is just a library you pass an image to and it returns an array. No, not a SaaS. Then on my own pump.io, diaspora, redMatrix etc. it just works. My data, my images, my network. I'm not against the tech at all, though. Neat.

There are already pre-trained networks out there. TensorFlow comes with an example command-line tool: you pass it any image and it tells you what is in the image.

The classes that it can detect are from ImageNet, so that might be limiting.
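
To make the "pass an image, get an array back" idea concrete: a classifier like this typically emits one raw score per class, and the "array" is just those scores turned into ranked labels. Here is a rough sketch of that last step only, not the TensorFlow tool's actual code. The function name `top_k_labels`, the four labels, and the scores are all made up for illustration; a real model would supply the scores.

```python
import math


def top_k_labels(scores, labels, k=3):
    """Return the k highest-scoring (label, probability) pairs.

    `scores` are raw per-class outputs (logits); `labels` are the class
    names in the same order. Both are assumed inputs, not a real API.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # Softmax, subtracting the max first for numerical stability.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Sort by probability, highest first, and keep the top k.
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical labels and scores standing in for a real model's output.
labels = ["golden retriever", "beagle", "tabby cat", "espresso"]
scores = [2.0, 0.5, 4.0, -1.0]

top = top_k_labels(scores, labels, k=2)
for name, prob in top:
    print(f"{name}: {prob:.3f}")
```

Everything before this step (loading the image, running the network) depends on which framework and model you pick, so it is left out here.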

If you were to train your own net, ImageNet is one of the biggest and most complete datasets and you'd surely use it for training. The alternative is to make your own training set, which will cost you money and/or time. For a proof of concept or initial prototype (until your business can pay for it), those classes should be enough.
ImageNet is not going to have something like "smile" in its dataset like they showed in the video. It has all kinds of possible dog breeds instead.

Maybe someone should create a website that lets volunteers label images for this purpose.

Yeah, in "Person, individual, someone, somebody, mortal, soul" there's the deeper category "smiler" which contains the categories "smirker" and "simperer". Some of the images are (note: my idea was to link to the images directly, but it isn't loading, so I had to take a screenshot and upload to imgur):

* http://i.imgur.com/Wex6pSR.png

I'd say that is pretty much what you'd need to train a net to detect people smiling (amongst other things). Of course, there are some refinements you can make to improve accuracy and presentation of the results. My point was: you should begin with datasets that are readily available, and then improve them as needed (if resources are available to justify the investment).
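
As a rough sketch of how you could turn that category tree into "smiling / not smiling" training labels: collect everything under "smiler" and treat it as the positive class. The `CHILDREN` dict below is a hand-written, hypothetical excerpt. Only "person", "smiler", "smirker" and "simperer" come from the hierarchy mentioned above; "other_person_category" is a placeholder, and the helper names are mine.

```python
# Hypothetical excerpt of the category tree (parent -> direct children).
CHILDREN = {
    "person": ["smiler", "other_person_category"],  # placeholder sibling
    "smiler": ["smirker", "simperer"],
}


def descendants(root, children):
    """Return the set containing `root` and every category below it."""
    found = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in found:
            continue  # guard against revisiting a category
        found.add(node)
        stack.extend(children.get(node, []))
    return found


def binary_label(category, positive_categories):
    """Return 1 if the image's category counts as positive, else 0."""
    return 1 if category in positive_categories else 0


smiling = descendants("smiler", CHILDREN)

# Label a few example image categories for a smile detector.
for cat in ["smirker", "simperer", "other_person_category"]:
    print(cat, binary_label(cat, smiling))
```

From there you'd feed the labelled images to whatever training setup you already have; the point is only that the existing hierarchy gives you the positive class for free.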

See Visual Genome for the next level beyond ImageNet: http://visualgenome.org/