by dr_zoidberg 3729 days ago
If you were to train your own net, ImageNet is one of the biggest and most complete datasets and you'd surely use it for training. The alternative is to make your own training set, which will cost you money and/or time. For a proof of concept or initial prototype (until your business can pay for it), those classes should be enough.

ImageNet is not going to have something like "smile" in its dataset like they showed in the video. It has all kinds of possible dog breeds instead.

Maybe someone should create a website that lets volunteers label images for this purpose.

Yeah, under "Person, individual, someone, somebody, mortal, soul" there's the deeper category "smiler", which contains the categories "smirker" and "simperer". Here are some of the images (note: I meant to link to the images directly, but the page wasn't loading, so I took a screenshot and uploaded it to imgur):

* http://i.imgur.com/Wex6pSR.png

I'd say that is pretty much what you'd need to train a net to detect people smiling (among other things). Of course, there are refinements you can make to improve accuracy and the presentation of the results. My point was: you should begin with datasets that are readily available, and then improve as needed (and if resources are available to justify the investment).
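To make the "start with what's readily available" idea a bit more concrete, here's a rough sketch of turning an existing category hierarchy into smiling / not-smiling labels. Only the "smiler" -> "smirker" / "simperer" nesting comes from the categories mentioned above; the dict layout and the names `HIERARCHY`, `collect_subtree` and `binary_label` are made up for illustration and are not part of any ImageNet tooling.

```python
# Hypothetical, hand-written slice of the category hierarchy discussed above.
# The intermediate levels between "person" and "smiler" are not listed in the
# thread, so they are simply left out here (an assumption of this sketch).
HIERARCHY = {
    "person": ["smiler"],
    "smiler": ["smirker", "simperer"],
    "smirker": [],
    "simperer": [],
}


def collect_subtree(root, hierarchy):
    """Return root plus every category nested below it, depth-first."""
    found = [root]
    for child in hierarchy.get(root, []):
        found.extend(collect_subtree(child, hierarchy))
    return found


def binary_label(category, positives):
    """Return 1 if the category counts as a positive example, else 0."""
    return 1 if category in positives else 0


# Every category under "smiler" is treated as a positive ("smiling") class.
SMILING = set(collect_subtree("smiler", HIERARCHY))

if __name__ == "__main__":
    for name in ["smirker", "simperer", "person"]:
        print(name, binary_label(name, SMILING))
```

Images from any category in `SMILING` would be the positives, and you'd sample negatives from the rest of the person subtree. It's only a starting point; the refinements mentioned above still apply.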

See Visual Genome for the next level beyond ImageNet: http://visualgenome.org/