by bdamos
3905 days ago
Yes, the processing pipeline first does face detection and a simple transformation to normalize all faces to 96x96 RGB pixels. Then each face is passed into the neural network to get a 128-dimensional representation on the unit hypersphere. For a landscape, face detection would probably find no faces, so the neural network would never be called. An image with multiple people produces multiple outputs: one bounding box and one associated representation per detected face.
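To make the control flow concrete, here is a minimal sketch of that pipeline in plain Python. It is not the project's actual code: `represent_faces`, `fake_detect`, `fake_align`, and `fake_embed` are hypothetical names, and the detector, aligner, and network are stand-in stubs. Only the shape of the flow is taken from the description above (detect, normalize to 96x96, embed to 128 dimensions, scale to unit length).

```python
import math
from typing import Callable, Dict, List, Tuple

BBox = Tuple[int, int, int, int]  # (left, top, right, bottom)
EMBED_CALLS: List[BBox] = []      # records every call to the stand-in network


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def represent_faces(
    image: Dict,
    detect: Callable[[Dict], List[BBox]],
    align: Callable[[Dict, BBox], Dict],
    embed: Callable[[Dict], List[float]],
) -> List[Tuple[BBox, List[float]]]:
    """Return one (bounding box, unit-length embedding) pair per detected face."""
    results = []
    for bbox in detect(image):       # empty for a landscape, so embed never runs
        face = align(image, bbox)    # normalize the crop to a fixed size
        results.append((bbox, l2_normalize(embed(face))))
    return results


# Stand-in components (assumptions for illustration only).
def fake_detect(image: Dict) -> List[BBox]:
    # Pretend detector: the "image" dict already lists its face boxes.
    return list(image["faces"])


def fake_align(image: Dict, bbox: BBox) -> Dict:
    # Pretend alignment: only records the target 96x96 RGB shape.
    return {"bbox": bbox, "shape": (96, 96, 3)}


def fake_embed(face: Dict) -> List[float]:
    # Pretend network: a deterministic, non-zero 128-dimensional vector.
    EMBED_CALLS.append(face["bbox"])
    seed = face["bbox"][0]
    return [float((i + seed) % 7 + 1) for i in range(128)]
```

Calling `represent_faces({"faces": []}, fake_detect, fake_align, fake_embed)` returns an empty list without touching the network stub, while an image with two boxes yields two pairs, each with a 128-element vector of length 1.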