by theonlyklas
1928 days ago
I believe these descriptors are created from the visual image alone: "A face descriptor is obtained from the learned networks as follows: the centre 224 × 224 crop of the face image is used. The shorter side is resized to 256, and the CNNs descriptor is computed for this region by extracting the deep features from the layer adjacent to the classifier layer. This leads to a 2048 dimensional descriptor, which is then L2 normalised." https://www.arxiv-vanity.com/papers/1710.08092/
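
To make the quoted steps concrete, here is a rough sketch of the geometry and the normalisation only. It assumes the resize (shorter side to 256) happens before the 224 × 224 centre crop, which the quote does not state unambiguously, and it leaves out the CNN itself. The function names are made up for illustration and are not from the paper's code.

```python
import math


def resize_shorter_side(width, height, target=256):
    """Return (new_width, new_height) with the shorter side scaled to `target`,
    keeping the aspect ratio. Assumption: plain rounding of the longer side."""
    scale = target / min(width, height)
    return round(width * scale), round(height * scale)


def centre_crop_box(width, height, size=224):
    """Return (left, top, right, bottom) of a centred size x size crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size


def l2_normalise(vector):
    """Scale a descriptor to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        return list(vector)
    return [x / norm for x in vector]


if __name__ == "__main__":
    w, h = resize_shorter_side(640, 480)   # hypothetical input image size
    box = centre_crop_box(w, h)            # region that would be fed to the network
    # Stand-in for the 2048-dimensional output of the layer before the classifier.
    descriptor = l2_normalise([1.0] * 2048)
    print((w, h), box, len(descriptor))
```

Nothing here uses anything other than pixel coordinates and the network's output vector, which is the point: the descriptor depends only on the image crop.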