by tw000001 2203 days ago
>Fast Company [2] writes about this as well: "The ACLU in both tests used an 80% match confidence threshold, which is Amazon’s default setting, but Amazon says it encourages law enforcement to use a 99% threshold for spotting a match."

Then this whole thing is potentially misleading because there's a huge difference between 80% and 99%. It's probably nonlinear and they could possibly see their false matches drop to 0. This is not a fair test - or rather, the conclusions are not quite supported by the parameters.

Not that I'm defending police use of facial recognition tech, I think it's abhorrent, though possibly inevitable.


They made a facial recognition tool available to law enforcement, and the marketing says it "requires no machine learning expertise to use", so I think it's fair to look at any value of the threshold parameter they make available. Especially a parameter that, by changing it, will give you the answer you want more often.

I'm deeply troubled by the text I've seen here implying this threshold is some accuracy percentage or positive predictive value percentage. Unless God is working behind the scenes at AWS they can't make any claim about the accuracy of the model on an as yet unseen population of images.

That's even before getting to the more esoteric map vs territory concerns like identical twins, altered images, adversarial makeup and masks, etc.
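
To put rough numbers on why a match threshold is not a positive predictive value, here is a minimal sketch using Bayes' rule. The true-positive and false-positive rates are made-up figures for one hypothetical threshold, not anything measured on Rekognition. The point is only that the same threshold gives very different PPVs depending on how common true matches are in the population being searched.

```python
def positive_predictive_value(tpr, fpr, prevalence):
    """Share of reported matches that are real matches, via Bayes' rule.

    tpr: fraction of true matches that score above the threshold
    fpr: fraction of non-matches that score above the threshold
    prevalence: fraction of comparisons that are true matches
    """
    true_hits = tpr * prevalence
    false_hits = fpr * (1.0 - prevalence)
    return true_hits / (true_hits + false_hits)


# Hypothetical operating point for one fixed threshold (assumed values).
TPR, FPR = 0.95, 0.001

for prevalence in (0.5, 0.01, 0.0001):
    ppv = positive_predictive_value(TPR, FPR, prevalence)
    print(f"prevalence={prevalence:<7} PPV={ppv:.3f}")
```

With these assumed rates the PPV is high when half of the comparisons are true matches and falls sharply when true matches are rare, even though the threshold never changed.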

Just to make sure I understand, which "whole thing" is misleading? The ACLU's test? Amazon's response?

As for the test, you say it's not a fair test. The point / conversation right now seems to be about the choice of parameters used by the ACLU. As far as I see / understand, the ACLU used the default parameters (and/or those recommended in the documentation / articles that are still up today with those same non-99% values).

What would have been a better / fairer test?

What are police departments using? My uninformed guess would be not 99%. I think therein lies the concern...
My cynical guess would be "whatever the lowest number they can get away with using".

I would bet good money that cops' KPI goals benefit from false positives, since they'll reward higher "number of identified/interviewed suspects" and "number of arrests" as a positive thing even if "number of convictions" doesn't line up.

Even more cynically, I'd bet this is a powerful technique for getting ambitious cops promoted, and that there's little blowback on fraudulently manipulating parameters that adversely affect POC much more significantly than white people.

Thinking about it, I'm now recalling the multiple reports of police departments claiming to not be using clearview.ai, only to have to backtrack when clearview's customer data got popped and it became public knowledge that individual cops were signing up for free trials - which their department/management either chose to hide or didn't know about. That's reasonably compelling circumstantial evidence to me that ambitious cops are quick to jump on unproven and unauthorised technology with insufficient oversight or with management actively avoiding oversight for them...

In regards to the KPIs, this is a known reality. Most states get money from the federal government's highway safety program. The states then disburse it to local police depts, and they expect high numbers of citations (or even warnings) to be reported back up the chain. It is only for DUI that verdicts are considered, and that's only amongst the smarter states. Related to crime, there are NO KPIs based on the final outcome - all are based on the elements the police are able to carry out and be accountable for on their own. This makes sense in some ways beyond self promotion. I will say also that the general inflation of KPIs in order to justify promotions, grant renewals, etc. is RAMPANT in state and local govs, but especially in policing when it comes to new tech investments and promotions.
If they can turn the knob, why wouldn’t they? This stuff isn’t admissible in court, and you can sweep for potential matches to follow up on.

If the default is 80, most will be 80. The SE may say “I’m told to inform you that you should use 99.”, but I’m sure he is winking.

Wouldn't it be more likely that they say "ok, we can interview/investigate/whatever X number of people" and then they adjust the threshold to produce that number? If 80% gives them 10,000 hits and 99% gives them one or none, then nobody is going to just go with either setting.
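
A minimal sketch of that "work backwards from the caseload" idea, purely illustrative. The similarity scores are invented and `threshold_for_budget` and `hits_at` are hypothetical helpers, not part of any vendor API.

```python
def hits_at(scores, threshold):
    """Count candidates whose similarity score meets the threshold."""
    return sum(1 for s in scores if s >= threshold)


def threshold_for_budget(scores, budget):
    """Return the score of the budget-th best candidate.

    Using that score as the threshold yields at least `budget` hits
    (more if several candidates tie at that score).
    """
    if budget <= 0 or not scores:
        raise ValueError("need a positive budget and at least one score")
    ranked = sorted(scores, reverse=True)
    return ranked[min(budget, len(ranked)) - 1]


# Invented similarity scores on a 0-100 scale.
scores = [99.2, 97.5, 91.0, 88.4, 85.1, 83.3, 80.6, 72.0, 65.5]

print("hits at 80:", hits_at(scores, 80))
print("hits at 99:", hits_at(scores, 99))

chosen = threshold_for_budget(scores, budget=3)
print("threshold for a 3-person budget:", chosen)
print("hits at that threshold:", hits_at(scores, chosen))
```

In this toy list the threshold ends up wherever the workload dictates, which is neither the 80 default nor the recommended 99.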
I'd guess that with the potato quality of facial pictures from security or phone cameras at incidents, you might want lower confidence matches to get outcomes out of lousy pictures.