Hacker News
by GrantS 3658 days ago
Images don't map directly onto a word2vec-like solution, but the closest equivalent is a pre-trained deep network, for example one of the models listed here: http://caffe.berkeleyvision.org/model_zoo.html
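
To make the analogy a bit more concrete: once a pre-trained network has turned each image into a feature vector, you can compare those vectors much as you would compare word2vec embeddings. Here is a rough sketch of that last step only. The feature values and file names are made-up toy stand-ins, not real network activations, and `cosine_similarity` and `nearest` are hypothetical helpers rather than part of Caffe or any other library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, features):
    """Return the key in `features` whose vector is most similar to `query`."""
    return max(features, key=lambda name: cosine_similarity(query, features[name]))


# Assumption: these short vectors stand in for activations taken from a late
# layer of a pre-trained network. Real ones would have thousands of dimensions.
features = {
    "cat_1.jpg": [0.9, 0.1, 0.0, 0.2],
    "cat_2.jpg": [0.8, 0.2, 0.1, 0.1],
    "car_1.jpg": [0.0, 0.9, 0.8, 0.1],
}

query = [0.85, 0.15, 0.05, 0.15]  # toy feature vector for a new image
print(nearest(query, features))
```

In practice the vectors would come from a forward pass through whichever model you pick, and for a large collection you would want an approximate nearest-neighbour index rather than a linear scan.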

You're right that bag of words with SIFT is no longer state of the art; deep learning dominates computer vision these days.