| HN Mirror


by coleca 2378 days ago

Just to be pedantic, in the spirit of HN: Google isn't training models with your face from your photo library. The way face recognition works is that Google would collect a dataset somehow, label it for the various features, and train a face recognition model on it. Usually this is done with a carefully curated dataset that includes a range of ages, genders, ethnicities, lighting conditions, angles, and camera types.

When you use Google Photos, it uses that pre-trained model to determine the features of the faces it finds in your library and builds a vector (also known as a face template or feature vector), which is just a long string of numbers that represents each face. Through various machine learning techniques it can compare two vectors to see how alike the two faces are. If the confidence score is higher than some predetermined threshold (say 70%), the faces are assumed to be the same person. By running these comparisons over and over across all the photo pairs, the software can group, or cluster, faces so that it knows these photos contain person 1 and those photos contain person 2. Google never knows who those people are unless you tag the images with a name.
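To make the compare-and-cluster step concrete, here is a minimal sketch in Python. It is not how Google Photos works internally; it assumes cosine similarity as the "confidence score", reuses the 70% threshold from above, and uses tiny made-up 3-number vectors in place of real face templates (which would be hundreds of numbers long). The names `cosine_similarity` and `cluster_faces` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_faces(vectors, threshold=0.7):
    """Group vector indices so that any pair scoring above the threshold
    ends up in the same group (simple union-find over all pairs)."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the group's representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair and merge the groups of pairs that match.
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine_similarity(vectors[i], vectors[j]) > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# Made-up vectors: faces 0 and 1 are close, faces 2 and 3 are close.
faces = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.95],
    [0.1, 0.0, 0.9],
]
clusters = cluster_faces(faces)
print(clusters)  # [[0, 1], [2, 3]]
```

Note that the output is only "group 0" and "group 1": nothing in the vectors says who either person is, which matches the point about names only appearing once you tag them.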

The images in your camera roll aren't used for re-training the original model because Google doesn't know the ground truth about your photos. Google can guess that these 3 faces are the same, but it doesn't know for certain that they are, so they can't use that to retrain the model that would be used in the Photos app because they have no way to judge the accuracy.

Another interesting point is that the vector is also unique to the specific model that was used to create it. So, if in the future they do retrain the model, the vectors that had been created with previous models would be 100% incompatible with the new model and would need to be recreated from the source image.
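A toy illustration of why vectors don't carry over between models, again in Python. The two "models" here (`model_v1`, `model_v2`) are hypothetical stand-ins that encode the same information with a different axis layout; real retrained networks differ in far less tidy ways, but the effect on cross-model comparison is similar.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def model_v1(features):
    # Hypothetical "old" model: emits the features as-is.
    return list(features)


def model_v2(features):
    # Hypothetical "retrained" model: same information, different axis layout.
    return [features[2], features[0], features[1]]


# Two made-up measurements of the same face.
face = [0.9, 0.1, 0.0]
same_face_again = [0.8, 0.2, 0.1]

# Each model agrees with itself that these are the same person...
within_v1 = cosine_similarity(model_v1(face), model_v1(same_face_again))
within_v2 = cosine_similarity(model_v2(face), model_v2(same_face_again))

# ...but a stored v1 vector compared against a fresh v2 vector does not match.
across = cosine_similarity(model_v1(face), model_v2(same_face_again))

print(within_v1 > 0.7, within_v2 > 0.7, across > 0.7)  # True True False
```

The stored v1 vector cannot be "translated" without knowing how the two models relate, so in practice the vectors get recomputed from the source image.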

Note: I have no inside knowledge of Google, but as the former CTO of a facial recognition company, I have a good idea how these systems work in general.

2 comments

vorpalhex 2378 days ago

Google absolutely allows you to confirm its tags and uses those confirmations for retraining, which means that yes, my facial profile is collected, stored, and used for model training (unless you disable it in preferences).

You can't do "celebrity" recognition from a generalized data set.


coleca 2378 days ago

Logically, that doesn't seem likely, because it would mean any individual or set of individuals could enter false data and poison Google's model going forward.


asudosandwich 2378 days ago

"Google can guess that these 3 faces are the same, but it doesn't know for certain that they are"

uCaptcha V3: “Click the people you know.”
