by mlthoughts2018
2745 days ago
The same is true for computer vision models. A core deep network is usually trained either with a dual embedding of associated text against search ranking, or to predict tags or labels. The resulting network may be of limited use on the original training task, but taking the activations from some deep layer gives you an excellent embedding model. You then automatically encode your entire image collection, plus incoming images, into that embedding space and rely on it as a lingua franca on which to base all sorts of companion models: object detection, face recognition, gender/age/ethnicity prediction, spam detection, aesthetic/composition appraisal, caption generation, style transfer, etc.
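
To make the "encode once, reuse everywhere" pattern concrete, here is a rough sketch in plain Python. Everything in it is made up for illustration: `make_backbone` is a toy random projection standing in for a real pretrained network cut off at a deep layer, the four-number "images" are fake, and the nearest-centroid "heads" are just the simplest companion model I could fit in a few lines. The point is only the shape of the pipeline: one shared embedding store, several small models trained on top of it.

```python
import math
import random

EMBED_DIM = 8  # hypothetical embedding size


def make_backbone(input_dim, embed_dim, seed=0):
    """Toy stand-in for a pretrained network truncated at a deep layer."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(input_dim)] for _ in range(embed_dim)]

    def embed(image):
        # One linear layer plus tanh, then L2-normalise the result.
        vec = [math.tanh(sum(w * x for w, x in zip(row, image))) for row in weights]
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        return [v / norm for v in vec]

    return embed


def fit_centroids(embeddings, labels):
    """Tiny companion model: the mean embedding of each label."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid has the largest dot product with vec."""
    return max(centroids, key=lambda lbl: sum(c * v for c, v in zip(centroids[lbl], vec)))


embed = make_backbone(input_dim=4, embed_dim=EMBED_DIM)

# Fake "images"; in practice these would be pixel tensors.
collection = {
    "img_a": [0.9, 0.1, 0.0, 0.2],
    "img_b": [0.8, 0.2, 0.1, 0.1],
    "img_c": [0.1, 0.9, 0.7, 0.0],
    "img_d": [0.0, 0.8, 0.9, 0.1],
}

# Encode the whole collection once and keep the vectors around.
store = {name: embed(pixels) for name, pixels in collection.items()}
vectors = [store[name] for name in collection]

# Two unrelated companion models trained on the same cached embeddings.
spam_head = fit_centroids(vectors, ["ok", "ok", "spam", "spam"])
style_head = fit_centroids(vectors, ["photo", "sketch", "photo", "sketch"])

# An incoming image is embedded once and scored by every head.
new_vec = embed([0.05, 0.85, 0.8, 0.05])
print(predict(spam_head, new_vec), predict(style_head, new_vec))
```

In a real system the backbone would be the expensive part and the heads would be cheap, which is why caching the embeddings pays off: adding a new companion model means training on stored vectors instead of re-running the big network over the whole collection.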