by yummyfajitas 3685 days ago
According to ProPublica's own analysis, the claim of bias cannot be shown to be statistically significant. https://www.propublica.org/article/how-we-analyzed-the-compa...

This article is terrible data journalism and probably deliberately misleading.

Step 1: write down conclusion.

Step 2: do analysis.

Step 3: if analysis doesn't support conclusion, write down a bunch of anecdotes.

Really, here's her R script: https://github.com/propublica/compas-analysis/blob/master/Co...

Just read that. It's vastly better than this nonsensical article.

They analyzed what they could -- the outcomes of the algorithm (recommendation) and the accuracy of those recommendations. They picked out specific examples, but the analysis was over the whole data set. I think you missed these relevant parts from the article:

> We obtained the risk scores assigned to more than 7,000 people arrested in Broward County, Florida, in 2013 and 2014 and checked to see how many were charged with new crimes over the next two years, the same benchmark used by the creators of the algorithm.

> The score proved remarkably unreliable in forecasting violent crime: Only 20 percent of the people predicted to commit violent crimes actually went on to do so.

> The formula was particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate as white defendants. White defendants were mislabeled as low risk more often than black defendants.

> Could this disparity be explained by defendants’ prior crimes or the type of crimes they were arrested for? No. We ran a statistical test that isolated the effect of race from criminal history and recidivism, as well as from defendants’ age and gender.

> Black defendants were still 77 percent more likely to be pegged as at higher risk of committing a future violent crime and 45 percent more likely to be predicted to commit a future crime of any kind.

Go read the description of the statistical analysis or just view their R notebook:

https://github.com/propublica/compas-analysis/blob/master/Co...

Their own analysis shows (p ~= 0) that the high and medium risk factors are predictive. It also shows that the racial bias terms (race_factorAfrican-American:score_factorHigh, etc.) are probably not predictive (p > 0.05).

Your quotes are not evidence of bias, though I see how they might confuse an innumerate reader. It's interesting how good a job this article is doing confusing the innumerate - it's almost as if it was written to mislead without technically lying.

For example, black defendants being pegged as more likely to commit crimes can be caused by one of two things: bias, or perhaps black defendants actually are more likely to commit crimes. According to ProPublica's own analysis (see race_factorAfrican-American), the latter is actually the case. This is true with p = 4.52e-06 - see line [36].

I read through the entire analysis. It appears that you stopped reading after you saw a p-value that supported your bias. That is bias in the sense of a preconceived notion. You then proceeded to pedantically argue that the well-demonstrated bias of the algorithm (more false positives for blacks than for whites, about 40% vs 20%) does not exist because of a p-value that came in between 0.05 and 0.1 instead of below 0.05.
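
For readers following the numbers: the false positive rate mentioned above is the share of defendants who did not reoffend but were still labeled higher risk. A minimal Python sketch of that calculation follows. The group names and counts are hypothetical, chosen only to reproduce the rough 40% vs 20% figures; they are not taken from the ProPublica data.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """Share of people who did NOT reoffend but were still flagged as higher risk."""
    did_not_reoffend = false_positives + true_negatives
    if did_not_reoffend == 0:
        raise ValueError("no non-reoffenders in this group")
    return false_positives / did_not_reoffend


# Hypothetical (false_positives, true_negatives) counts per group.
# These are illustrative numbers, not the Broward County data.
groups = {
    "group_a": (400, 600),
    "group_b": (200, 800),
}

rates = {name: false_positive_rate(fp, tn) for name, (fp, tn) in groups.items()}

for name, rate in rates.items():
    print(f"{name}: false positive rate = {rate:.0%}")
```

Note that this rate conditions on the actual outcome (no new charge), which is a different quantity from a regression coefficient on race, so the two can disagree without either being miscalculated.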

Please let me know when your reading comprehension catches up with your mediocre statistics comprehension.

Maybe you just didn't realize that the 20-20 hindsight data -- prediction vs recidivism -- is included right there in the analysis. Or maybe you did realize it later and just decided you'd dug in so much that you didn't want to admit your ignorance.

Or maybe you still haven't comprehended the difference between the meanings of the word bias.

(From my above reply too, as it applies here also):

Let's be clear -- if the null hypothesis in this case is true (that there is no bias), and all other assumptions made are true, there is a slightly greater than 5.7% chance of obtaining this result (or something even more skewed). That's a great bar for publication of SCIENCE. It's not a great bar for hiding behind a proprietary algorithm used in sentencing. People talk about misuse of p-values, but this takes the cake.
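
The 5.7% figure above is a p-value: the probability, under the null hypothesis, of a test statistic at least as extreme as the one observed. A minimal sketch of how such a number arises, assuming a two-sided Wald test with a standard normal reference distribution. The z value of 1.9 is a hypothetical input picked to land near that figure; it is not read from the notebook.

```python
import math


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z, i.e. the two-sided p-value."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical test statistic (assumption, not from the ProPublica notebook).
z_hypothetical = 1.9
p = two_sided_p_value(z_hypothetical)

# The same result passes or fails depending on the threshold chosen.
significant_at_5_percent = p < 0.05
significant_at_10_percent = p < 0.10

print(f"z = {z_hypothetical}, p = {p:.4f}")
print(f"below 0.05: {significant_at_5_percent}, below 0.10: {significant_at_10_percent}")
```

The point of the sketch is only that 0.05 is a convention: a statistic just under 1.96 and one just over it describe nearly the same evidence.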

If you want to criticize the details of her analysis, go ahead. I'm solidly in the Bayesian camp and I agree with you 100%. What I'd have done is compute posteriors on all these coefficients and then compute Bayes factors/probability of bias.

I'm confused though; the mood affiliation of your post somehow suggests that her less than perfect choice of a statistical methodology somehow supports her claims. Could you explain that? Or am I simply misunderstanding what you are trying to say?

Also, let's suppose we just take her own analysis at face value, and don't view it through the p-value lens. The maximum likelihood estimate suggests that even if this effect is not random chance, it's not very big. I.e., the "score factor high" estimate is >8x larger than the "score factor high, race = black" estimate. Isn't this really good? Do you really think the human biases that this algorithm mitigates are lower than this?
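
To make the size comparison concrete: in a Cox model each coefficient is a log hazard ratio, so exponentiating it gives a multiplicative change in hazard, and an interaction term multiplies on top of the main effect. A minimal sketch with hypothetical coefficients follows. The values 1.6 and 0.2, including their signs, are made up to give an 8x ratio and are not read from the R notebook.

```python
import math


def hazard_ratio(coefficient: float) -> float:
    """Convert a Cox model coefficient (log hazard ratio) to a hazard ratio."""
    return math.exp(coefficient)


# Hypothetical coefficients (assumptions for illustration only).
beta_score_high = 1.6    # main effect of a "High" score
beta_high_x_race = 0.2   # interaction: "High" score within one race group

hr_high = hazard_ratio(beta_score_high)
hr_interaction = hazard_ratio(beta_high_x_race)

# High vs. low score within the interaction group: the effects add on the
# log scale, so the hazard ratios multiply.
hr_high_with_interaction = hazard_ratio(beta_score_high + beta_high_x_race)

print(f"coefficient ratio: {beta_score_high / beta_high_x_race:.1f}x")
print(f"hazard ratio, high score: {hr_high:.2f}")
print(f"hazard ratio, interaction alone: {hr_interaction:.2f}")
print(f"hazard ratio, high score with interaction: {hr_high_with_interaction:.2f}")
```

A ratio of coefficients is a ratio on the log scale, so "8x larger" does not translate into an 8x difference in hazard ratios.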

Lastly, what specific analysis would convince you that this algorithm is predictive and non-biased (or more realistically, not very biased)?

> maximum likelihood

That may be grounds for a mistrial. Decisions about crimes are not judged by the "maximum likelihood".

> what specific analysis would convince you that this algorithm is predictive and non-biased

What is it going to take to convince you that the choice of model and which data to use as input is just as important as the analysis itself?

> race_factor

Depending on the situation, using race or other protected classes is illegal. One of the reasons we have a right to face our accusers is to provide an opportunity to challenge those accusations. Racial (or any other protected class) discrimination doesn't become legal when it is hidden behind an equation or algorithm. If the government wants to keep the method secret, then anything derived from those methods should be excluded.

> human biases

...are off topic. An algorithm needs to justify its own existence.

> it's not very big

So you're fine with racial bias, as long as it only affects what you consider a "small" number of people.

> or perhaps black defendants actually are more likely to commit crimes

/sigh/

> What is it going to take to convince you that the choice of model and which data to use as input is just as important as the analysis itself?

I'm already convinced of this. Are you trying to imply that the Cox model is wrong or something? If so, why not just make that argument explicitly?

Of course, if the Cox model is wrong, why do you believe the algorithm is biased? Isn't that reason to disregard the entire ProPublica article (which is all based on the Cox model)?

> Depending on the situation, using race or other protected classes is illegal.

Did you even read the article? "Northpointe’s core product is a set of scores derived from 137 questions that are either answered by defendants or pulled from criminal records. Race is not one of the questions."

> /sigh/

You can emote all you like. Reality does not change.

I must admit, the emotion on display here confuses me. Much like you I oppose racial bias. The R script provides evidence that very little racial bias is present in this system. Why does this inspire such negative emotion? It's almost as if you care more about looking anti-racist than you care about having racism's effects be reduced.