Hacker News
by rahuldottech 2378 days ago
If a friend clicks a photo of me and uploads it to Google Photos, IMO, it's not okay for Google to use my face to train models without explicit permission from me.

Unfortunately, as is often the case with technology, laws have not kept up with the latest developments, and likely will not in my country for several more decades. Welp.

4 comments

Just to be pedantic in the spirit of HN: Google isn't training models w/your face from your photo library. The way face recognition works is that Google would collect a dataset somehow, label it for the various features, and then train a model for face recognition. Usually this is done with a carefully curated dataset that would be sure to include various ages, genders, ethnicities, lighting conditions, angles, and camera types.

When you use Google Photos, it is using that pre-trained model to determine the features of the faces it finds in your library and it builds a vector, which is just a long string of numbers (also known as a face template or feature vector) that represents each face. Through various machine learning techniques it is able to compare 2 vectors to see how alike those 2 faces are. If the confidence score it finds is higher than some predetermined threshold (say 70%), it is assumed they are the same person. Running these comparisons over and over through all the photo pairs, the software can group or cluster faces so that it knows all these photos have person 1 and these photos have person 2. Google never knows who those people are, unless you tag those images with a name.
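To make the compare-and-cluster step concrete, here is a minimal sketch. It is not how Google Photos actually works; it assumes cosine similarity as the score, uses made-up 3-number vectors in place of real face templates, and the names `cosine_similarity` and `cluster_faces` are hypothetical.

```python
import math
from itertools import combinations

THRESHOLD = 0.7  # illustrative cutoff, echoing the "say 70%" above


def cosine_similarity(a, b):
    """Score how alike two vectors are (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_faces(vectors, threshold=THRESHOLD):
    """Group faces by comparing every pair and merging matches (union-find)."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the representative of i's group.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(vectors)), 2):
        if cosine_similarity(vectors[i], vectors[j]) > threshold:
            parent[find(j)] = find(i)  # assume same person, merge groups

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up "face templates"; real ones have far more dimensions.
faces = [
    [0.9, 0.1, 0.0],   # photo 0
    [0.8, 0.2, 0.1],   # photo 1, close to photo 0
    [0.0, 0.1, 0.95],  # photo 2, a different face
    [0.1, 0.0, 0.9],   # photo 3, close to photo 2
]
clusters = cluster_faces(faces)
print(clusters)
```

Note that the output is only groups of photo indices ("person 1", "person 2"). No names appear anywhere unless someone attaches them afterwards.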

The images in your camera roll aren't used for re-training the original model because Google doesn't know the ground truth about your photos. Google can guess that these 3 faces are the same, but it doesn't know for certain that they are, so they can't use that to retrain the model that would be used in the Photos app because they have no way to judge the accuracy.

Another interesting point is that the vector is also unique to the specific model that was used to create it. So, if in the future they do retrain the model, the vectors that had been created with previous models would be 100% incompatible with the new model and would need to be recreated from the source image.
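A toy illustration of that incompatibility, under loud assumptions: each "model" here is just a fixed linear map (`MODEL_V1`, `MODEL_V2`, and `embed` are hypothetical names), and a 3-number list stands in for the source image. Real models are nothing like this simple, but the effect is the same in spirit.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


# Two toy "models": the retrained one carries the same information
# but lays it out along different axes.
MODEL_V1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
MODEL_V2 = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]


def embed(model, image):
    """Apply the model's linear map to the stand-in image."""
    return [sum(w * x for w, x in zip(row, image)) for row in model]


image_a = [0.9, 0.1, 0.0]  # stand-in for one photo of a person
image_b = [0.8, 0.2, 0.1]  # stand-in for another photo of the same person

# Both vectors from the old model: they compare as very similar.
same_model = cosine(embed(MODEL_V1, image_a), embed(MODEL_V1, image_b))
# Old vector against a new-model vector: the score is meaningless.
mixed_models = cosine(embed(MODEL_V1, image_a), embed(MODEL_V2, image_b))
# Recreate both vectors from the source images with the new model.
reembedded = cosine(embed(MODEL_V2, image_a), embed(MODEL_V2, image_b))

print(same_model, mixed_models, reembedded)
```

The mixed comparison scores low even though both photos show the same person, which is why stored vectors have to be regenerated from the source images after a retrain.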

Note: I have no inside knowledge of Google, but as the former CTO of a facial recognition company, I have a good idea how these systems work in general.

Google absolutely allows you to confirm its tags and uses that for retraining, which means yes, my facial profile is collected, stored and used for model training (unless you disable it in preferences).

You can't do "celebrity" recognition from a generalized data set.

Logically, that doesn't seem likely because that would mean any individual or set of individuals, could enter false data and poison Google's model going forward.
> Google can guess that these 3 faces are the same, but it doesn't know for certain that they are

uCaptcha V3: “Click the people you know.”

How do you even start to write a law for something like this? And I mean a law that makes sense and takes into account the reality of the situation, not one used as grandstanding.

It becomes really messy really fast. If a law is established at some local level (local to a borough, state/canton, country), it will surely contradict laws from other places while overlapping with them.

From a very abstract view, companies will need to identify the person uploading the picture, identify the person in the picture, somehow determine which law to follow in the given circumstances (which depends on the context), and determine whether consent exists at the correct local level for each person in the picture. Then, and only then, can they train a model.

Correct. And the answer is "You don't," and it doesn't appear BIPA should cover the situation in question.
Just to be sure, are you saying that "you don't create such a law", or "you don't train models on human faces"?
In the sense of covering photographs in general, you don't create such a law. It's completely impractical to enforce (since photographic capture of faces is already ubiquitous in American society).

One could, hypothetically, make a law against training models on human faces. Good luck crafting that carefully enough to enforce it without undesired consequences (did we just ban training doctors on how to recognize stroke victims, or---worse---ban someone from making an automatic stroke detector that could be run on incoming patients in an ER to accelerate them to the front of the line?), but it's a better starting point than banning photographic collection of data.

Let's assume for a moment that the photo was taken in public.

Normally, you don't have much expectation of privacy in public spaces. Generally, a photographer is free to take photos in public and use those photos as they see fit. Why is Google's use of your photo different than the photographer's use of the photo?

> Why is Google's use of your photo different than the photographer's use of the photo?

The law doesn't protect your image, it protects your biometric information.

To extend your flawed analogy, a photographer isn't allowed to take a gigapixel photograph of someone in public and then use the data from their fingerprints or iris to uniquely identify them.

That sounds like a distinction without a difference.

I look at a photo, my brain tells me the photo is of "Bob", he's caucasian, male, with brown eyes. I enter this information in a spreadsheet.

Google's algorithm looks at a photo, the algorithm tells me the photo is of "Bob", he's caucasian, male, with brown eyes. I enter this data in a spreadsheet.

All Google has done is automate a process we were already capable of doing manually.

Edit - in my mind, the problem/question is people using photos commercially (or granting Google the rights to use them commercially) when they don't have the legal right to use them that way. This would mostly apply to photos taken in private places. Photos taken in public can usually be used commercially by default.

Would the photographer be allowed to identify you using your face? This is pretty commonly done in newspapers.
> Let's assume for a moment that the photo was taken in public.

This is a wrong assumption to make. What if the photos were clicked in my house? In a private gathering?

If it was taken in private, you have an expectation of privacy and the photographer is already legally prohibited from using the photo for commercial purposes. The photographer already should have obtained a commercial release.
so they can't identify you, until they identify you and get your permission to identify you?
> so they can't identify you, until they identify you and get your permission to identify you?

No; by default they (should) have no right to my personal data unless I explicitly opt in to it—and they shouldn't have a right to find me to ask me to use it, either.