by levlandau 4254 days ago
I think you'd need something slightly more complex than a "word2vec", since images already have a well-defined "word vector", i.e. the pixel. What you want is a "parser" that can take in an image and spit out the significant parts of it. Stanford might have the code from this paper ( http://machinelearning.wustl.edu/mlpapers/paper_files/ICML20...) up on their site.
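
To make the distinction concrete, here is a rough sketch in plain Python. It is not the method from the linked paper: `flatten`, `patches`, and `significant_parts` are hypothetical names, and ranking tiles by pixel variance is just a crude stand-in I'm assuming for a real learned "parser". The point is only that the raw pixel vector comes for free, while picking out the interesting regions takes an extra step.

```python
from statistics import pvariance

def flatten(image):
    """The 'free' representation: all rows concatenated into one pixel vector."""
    return [px for row in image for px in row]

def patches(image, size):
    """Yield (row, col, tile) for non-overlapping size x size tiles."""
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]

def significant_parts(image, size=2, top_k=1):
    """Toy 'parser' (an assumption, not a real model): rank tiles by variance."""
    scored = [(pvariance(flatten(tile)), (r, c))
              for r, c, tile in patches(image, size)]
    scored.sort(reverse=True)  # highest-variance tiles first
    return [pos for _, pos in scored[:top_k]]

# A tiny 4x4 grayscale "image": flat everywhere except the bottom-right tile.
image = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 1],
    [0, 0, 2, 8],
]

print(len(flatten(image)))       # length of the raw pixel vector
print(significant_parts(image))  # top-left corner of the busiest tile
```

A real system would replace the variance score with something learned, but the shape of the problem is the same: image in, a short list of regions out.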