
by leblancfg 3520 days ago

I agree, although one could imagine that the concepts conveyed in a photograph could be extracted and abstracted as vectors, just as word2vec and its successors do for words. Of course, there is a long way to go before we reach parity with human understanding, but I think ideas from [1], [2] and [3] could be extended to do just that.

[1] Deep Visual-Semantic Alignments for Generating Image Descriptions - cs.stanford.edu/people/karpathy/cvpr2015.pdf

[2] Deep Learning for Content-Based Image Retrieval - www.research.larc.smu.edu.sg/mlg/papers/MM14-fp336-hoi.pdf

[3] Deep Learning for Content-Based Image Retrieval - www.cs.rutgers.edu/~elgammal/pub/MTA_2014_Saleh.pdf
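
To make the "concepts as vectors" idea concrete, here is a toy sketch. It assumes images and words have already been embedded into one shared vector space; every vector below is hand-picked for illustration, not produced by any model from [1], [2] or [3], and the names `WORD_VECTORS` and `nearest_concepts` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-d "concept" vectors. Real ones would be learned
# (e.g. word2vec-style) and have hundreds of dimensions.
WORD_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "beach": [0.1, 0.9, 0.1],
    "car": [0.0, 0.1, 0.9],
}

def nearest_concepts(image_vector, word_vectors):
    """Rank words by cosine similarity to the image embedding, best first."""
    return sorted(
        word_vectors,
        key=lambda word: cosine(image_vector, word_vectors[word]),
        reverse=True,
    )

# Made-up embedding standing in for a photo of a dog on a beach.
photo = [0.7, 0.6, 0.05]
ranked = nearest_concepts(photo, WORD_VECTORS)
print(ranked)  # ['dog', 'beach', 'car']
```

The hard part, of course, is learning an embedding where a photo actually lands near the words for what it depicts; the lookup itself is just a nearest-neighbour search.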