by zerocrates 1046 days ago
The actual quote that the mention in the article refers to:

"Using diverse training sets can help reduce bias in FRT performance. Algorithms learn to compare images by training with a set of photos. Disproportionate representation of white males in training images produces skewed algorithms because Black people are overrepresented in mugshot databases and other image repositories commonly used by law enforcement. Consequently AI is more likely to mark Black faces as criminal, leading to the targeting and arresting of innocent Black people."

So they're saying that simultaneously the training set has too few black faces and the set being compared against has too many.


> Consequently AI is more likely to mark Black faces as criminal, leading to the targeting and arresting of innocent Black people.

I don’t see how this relates to simple facial recognition. It doesn’t appear that they’re scanning for “criminal physiognomies” but for specific facial matches.

Furthermore, it seems that this whole line of argumentation implies that facial recognition software may be mistaking innocent Black people for non-Black perpetrators, which I don’t see any evidence for. How does this increase arrest rates for Black people if AI just can’t tell them apart? In all likelihood, the person who got away is also Black.

It doesn't imply that it's matching black people to white perpetrators. The claim is that A) the model itself is worse at matching for black faces and B) the database being searched against is often disproportionately made up of black faces.

Give it a photo of a black person to search on and you're probably getting a black person as a match, but the likelihood that it's actually the same person is lower than it would be if you were searching for a white person.

The quote doesn't say it's increasing arrest rates for black people, but arrest rates for innocent black people. If you use facial recognition and it's 99% accurate for white people and 75% accurate for black people (numbers chosen arbitrarily), you're going to target a lot more black people incorrectly even if you're never incorrectly matching photos of white criminals to black people.
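To make that arithmetic concrete, here is a minimal sketch using the same arbitrary accuracies. The 1,000 searches per group is a made-up assumption, and "accuracy" is treated simply as the share of searches that return the right person.

```python
# Hypothetical numbers: accuracies are the arbitrary ones from the comment,
# and the search volume per group is assumed purely for illustration.
SEARCHES_PER_GROUP = 1000
ACCURACY_PERCENT = {"white": 99, "black": 75}


def expected_false_matches(num_searches: int, accuracy_percent: int) -> int:
    """Number of searches expected to return the wrong person."""
    return num_searches * (100 - accuracy_percent) // 100


false_matches = {
    group: expected_false_matches(SEARCHES_PER_GROUP, pct)
    for group, pct in ACCURACY_PERCENT.items()
}
ratio = false_matches["black"] / false_matches["white"]

print(false_matches)  # {'white': 10, 'black': 250}
print(ratio)          # 25.0
```

With equal search volume that's 10 wrong matches versus 250, and no cross-race mismatch is needed to get there. If the database being searched skews black, as the quote claims, the search volume wouldn't be equal either.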

> It doesn't imply that it's matching black people to white perpetrators. The claim is that A) the model itself is worse at matching for black faces and B) the database being searched against is often disproportionately made up of black faces.

Right, I understand that in the context of this specific quote, but the article implies that claim.

> Give it a photo of a black person to search on and you're probably getting a black person as a match, but the likelihood that it's actually the same person is lower than it would be if you were searching for a white person.

Lower, but by how much? The number of cases given here is six in all. It feels very premature to use “probably” in that sentence. (Edit: I misread that as “you’re probably going to get a match.”)

> The quote doesn't say it's increasing arrest rates for black people, but arrest rates for innocent black people.

I meant this quote from the article: “facial recognition leads police departments to arrest Black people at disproportionately high rates.”

But I agree. It seems that there is a disparity in accuracy. It’s very unclear how large it is, but so far it appears that we’re talking about a fraction of a percent. We only have a sample size of six to draw on. We don’t know the demographics of the districts where this has been employed, and it seems strange to assume that they’re the same as the American population at large. I mean, the first example is from Detroit.

The part of the article posted to HN that started this thread (the part about more/fewer black people in the data sets) quotes/paraphrases a Scientific American piece (which is where I got the quote with "innocent" in it for my comment). That piece is itself based on a paper in Government Information Quarterly.

The paper is what the article here links to when they say that facial recognition leads to disproportionate arrests of black people, the part you're mentioning now. That's a separate finding of the paper from the statements about possible reasons "why" that are based on the training and search sets.

The main thrust of the paper is actually those numbers: they find that black-white arrest disparity is higher in jurisdictions that use facial recognition.

"FRT deployment exerts opposite effects on the underlying race-specific arrest rates – a pattern observed across all arrest outcomes. LEAs using FRT had 55% (B = 1.55) significantly higher Black arrest rates and 22% lower White arrest rates (B = 0.78) than those not implementing this technology."

They do some stuff I'm not really qualified to opine on to try to control for the fact that facial recognition adoption is obviously also correlated with department size, budget, crime rate and things like that. Of course the usual caveats still apply, particularly that they're not claiming or attempting to show causation.
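For anyone who wants to sanity-check how the B values map to the percentages in that quote, here is a rough sketch. It assumes B is a multiplicative rate ratio (which is consistent with "55% higher" and "22% lower"). The combined disparity figure at the end is my own derived number, not something taken from the paper.

```python
def percent_change(rate_ratio: float) -> int:
    """Convert a multiplicative rate ratio into a rounded percent change."""
    return round((rate_ratio - 1.0) * 100)


# Values quoted from the paper; treating them as rate ratios is an assumption.
black_rate_ratio = 1.55
white_rate_ratio = 0.78

print(percent_change(black_rate_ratio))  # 55
print(percent_change(white_rate_ratio))  # -22

# Derived, not from the paper: if both ratios hold at once, the Black/White
# arrest-rate ratio is multiplied by roughly this factor.
disparity_multiplier = round(black_rate_ratio / white_rate_ratio, 2)
print(disparity_multiplier)  # 1.99
```

This is still an association between FRT use and arrest rates, not a causal estimate.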

This doesn't rescue their claim. If the suggested class imbalance really exists in the training/test sets, the model will preferentially identify whites as criminals.

The claim is that the model is worse at telling black faces apart from each other.

The system is trained to match images of faces, not identify criminals; it's not comparing things to its training set to give a "criminality" score. The training data is just what has taught the system how to extract features to compare. You run an image of an unknown person against your database of known images, and look for a match so you can identify the unknown person.
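A toy sketch of that matching step, to show there is no "criminality" score anywhere in it. Everything here is hypothetical: the three-number "embeddings" stand in for whatever features a real model extracts, and the names, the 0.9 threshold, and the choice of cosine similarity are assumptions for illustration, not how any particular product works.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Similarity between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(probe, gallery, threshold=0.9):
    """Return (name, score) of the closest gallery entry, or (None, score)
    if even the closest entry falls below the threshold."""
    name, score = max(
        ((n, cosine_similarity(probe, vec)) for n, vec in gallery.items()),
        key=lambda pair: pair[1],
    )
    return (name, score) if score >= threshold else (None, score)


# Hypothetical database of known images, already reduced to feature vectors.
gallery = {
    "person_a": [1.0, 0.0, 0.0],
    "person_b": [0.0, 1.0, 0.0],
    "person_c": [0.0, 0.0, 1.0],
}

probe = [0.9, 0.1, 0.0]  # features from the photo of the unknown person
match, score = best_match(probe, gallery)
print(match)  # person_a
```

The only output is "which known face is this closest to, and how close". If the learned features separate some faces less well, their vectors presumably sit closer together, so more wrong entries clear the threshold.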

If the model is just "worse at" black people, it's going to make more mistakes matching to them.