

by bdamos 3905 days ago

Yes. The processing pipeline first runs face detection and applies a simple transformation that normalizes every face to 96x96 RGB pixels. Each normalized face is then passed through the neural network to get a 128-dimensional representation on the unit hypersphere.
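To make "on the unit hypersphere" concrete: it means each representation is scaled to Euclidean length 1. Here is a minimal sketch of that step in plain Python. The helper names `l2_normalize` and `squared_distance` and the made-up `raw` vector are only for illustration, not the project's actual code.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (a point on the unit hypersphere)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vec]

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Stand-in for a raw 128-dimensional network output (made-up values,
# not real network activations).
raw = [float(i % 7 - 3) for i in range(128)]
embedding = l2_normalize(raw)

# The antipodal point is the farthest any unit vector can be from `embedding`.
opposite = [-x for x in embedding]
max_gap = squared_distance(embedding, opposite)
```

Since every representation has length 1, the squared distance between any two of them lies between 0 and 4, which makes distances comparable across faces.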

For a landscape, face detection would probably not find any faces and the neural network wouldn't be called.

An image with multiple people produces multiple outputs: a bounding box for each detected face, along with its associated representation.
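The control flow described above can be sketched roughly as follows. This is an illustration under stated assumptions: `process_image`, `FaceResult`, and the stub `detect`, `align`, and `embed` functions are hypothetical stand-ins, not the real detector or network.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), an assumed layout

@dataclass
class FaceResult:
    box: Box
    embedding: List[float]

def process_image(image, detect: Callable, align: Callable, embed: Callable) -> List[FaceResult]:
    """Detect faces, normalize each one, and embed it. No faces means no network calls."""
    results = []
    for box in detect(image):
        face = align(image, box)  # e.g. crop and warp to 96x96 RGB
        results.append(FaceResult(box, embed(face)))
    return results

# Hypothetical stubs so the sketch runs without a real detector or network.
embed_calls = []

def detect(image):
    return image["faces"]  # the toy "image" simply lists its face boxes

def align(image, box):
    return {"box": box, "size": (96, 96)}

def embed(face):
    embed_calls.append(face)
    return [1.0] + [0.0] * 127  # placeholder 128-dimensional unit vector

landscape = {"faces": []}
group_photo = {"faces": [(10, 10, 60, 60), (80, 12, 130, 62)]}

landscape_results = process_image(landscape, detect, align, embed)
calls_after_landscape = len(embed_calls)
group_results = process_image(group_photo, detect, align, embed)
```

With these stubs, the landscape yields an empty list without ever calling `embed`, and the group photo yields one `FaceResult` per face.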