by sjaknanxnnx 2521 days ago
I’m concerned because probability is not intuitive.

Suppose a store is robbed, and there’s a video.

The police identify some suspects - the guy who just got out of jail for robbing the same store, and another person the store owner had a dispute with. Neither of them looks like the robber in the video. Then the police take a still from the video and knock on some doors around the block. Somebody recognizes the person in the video, and the police investigate that person. This scenario seems pretty fair to me.

Now suppose the police run it through the facial recognition system. It identifies one person as a 99% match, and the police go investigate this person. This scenario does not seem so fair to me.

Here’s how I see the math:

P(A) = P(robber has a doppelgänger living on the same block) = .01

P(B) = P(robber had a doppelgänger somewhere in the database) = .9

P(X) = P(police screw up investigation, and will convict the suspect whether or not they are guilty) = .2

P(AX) = .002

P(BX) = .18

The exact numbers are made up, but as long as P(A) << P(B), you can see this tech will result in a huge increase in false convictions. Even if P(X) is low, the number of false convictions increases by a factor of P(B)/P(A).
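For anyone who wants to poke at the arithmetic, here is a minimal sketch of the numbers above. It assumes X is independent of both A and B, which is what multiplying the probabilities implies; the variable names are just for illustration.

```python
# Made-up numbers from the post; only the ratio between them matters.
p_a = 0.01  # P(A): robber has a doppelgänger living on the same block
p_b = 0.9   # P(B): robber has a doppelgänger somewhere in the database
p_x = 0.2   # P(X): botched investigation that convicts regardless of guilt

# Assumption: X is independent of A and B, so joint probabilities multiply.
p_ax = p_a * p_x
p_bx = p_b * p_x

# P(X) cancels out of the ratio, so the increase factor is P(B)/P(A).
ratio = p_bx / p_ax

print(f"P(AX) = {p_ax:.3f}")
print(f"P(BX) = {p_bx:.3f}")
print(f"increase factor = {ratio:.0f}x")
```

Changing p_x rescales both joint probabilities but leaves the ratio untouched, which is the point of the argument.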

1 comments

I'd argue the %s are not intuitive either way. In fact, if A does happen, and there really is an unlucky doppelganger, that person is very likely to be charged. That could have been avoided if the technology had produced 5 other suspects who don't live on that block. Should it therefore be allowed for criminal defense, if not prosecution?

The issue with the P(A) and P(B) argument is that police already use databases heavily, and most people don't have any problem with it. But why, when it comes to facial recognition, is it too dangerous to use technology to drive efficiency?

If they're looking for somebody named Jane Doe, anybody with that name shows up on a list and police investigate. Of course, if there are Jane Does within a 2-mile radius, they start with those. So why not just say that if the system delivers a match within x accuracy and the person is within y residents (plus a variety of other variables), and x/y is below a threshold, then the match can be presented to police for further investigation?

Searching databases for matches is fine for names, or fingerprints, shoe prints, tire tracks, fiber analysis - but not faces? I personally wonder if it's really any different, or if it's just better tailored for the media outrage machine because "China does it", or because "facial recognition targets minorities".

I think police searching databases for low-quality evidence like tire tracks, shoe prints, and fiber analysis is a very dodgy practice, for these exact reasons. The key is the specificity of the match, and I don’t think facial recognition is good enough. Fingerprints and DNA can be, but there are still known cases of people being falsely charged based on database searches with a partial match.

This tech can be good if applied to a narrow range of people like you suggest (e.g. only searching people who live in neighboring blocks), but nobody is actually doing that. We should pass laws requiring a rigorous analysis of these probabilities before such databases can be used, including a conversation about what rate of false positives we are willing to tolerate. Guardrails should be put in place to enforce those limits. If this is too hard, we don’t have a strong enough handle on this technology to be using it.
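To make the "narrow range" point concrete, here is a rough sketch of how the chance of at least one false match grows with the number of faces searched. It assumes every comparison is independent and has the same false match rate, which real systems won't satisfy exactly; the rate and database sizes are hypothetical, not measurements of any actual system.

```python
def p_any_false_match(false_match_rate: float, database_size: int) -> float:
    """Chance that at least one innocent person in the database matches.

    Assumes independent comparisons that all share the same
    per-comparison false match rate (a simplification).
    """
    return 1.0 - (1.0 - false_match_rate) ** database_size


# Hypothetical numbers, chosen only to show the effect of search scope.
rate = 1e-6
for n in (500, 1_000_000, 100_000_000):
    print(f"{n:>11,} faces searched -> {p_any_false_match(rate, n):.4f}")
```

Under these assumptions, a per-comparison rate that looks tiny is harmless for a few neighboring blocks and close to a guaranteed false hit for a national database. That is the kind of analysis a law could require before a search is allowed.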

Here’s the scenario that scares me the most:

Police identify a suspect using facial recognition. Then they put that person in a lineup for a witness. Of course the witness is going to say “that’s the one!” because the suspect actually looks like the perpetrator. The witness will be sure, the cops will be sure, and a jury will convict. And this scenario is completely determined by the use of the facial recognition database. This will happen unless we pass laws to prevent it.