by Malarkey73 3685 days ago
One of the most mind boggling sentences in that article was:

"On Sunday, Northpointe gave ProPublica the basics of its future-crime formula — which includes factors such as education levels, and whether a defendant has a job. It did not share the specific calculations, which it said are proprietary."

How on earth can you lock people up based on secret information? That is Kafka meets Minority Report.

3 comments

This is done regularly. It's called "judicial discretion" - a judge uses a neural network so secret that even he doesn't understand it (in fact the entire scientific field of "neuroscience" exists to try and analyze it).

Variables used in the formula include details of the case, race/appearance of the defendant, and how recently lunch was at the time of sentencing. Unlike the ProPublica claims of racial bias (which are merely "almost statistically significant" at the p=0.05 level), the lunch bias is statistically significant at the p < 0.01 level.

http://www.pnas.org/content/108/17/6889.full

This system sounds like a huge improvement.

Just FYI: The lunch paper has very serious problems, as described in this reply, also published in PNAS: http://www.pnas.org/content/108/42/E833.full

In particular, the cases are heard in a particular order. For each prison, the prisoners with counsel go before those who are representing themselves. As in the US, those representing themselves typically fare worse. The judges try to finish an entire prison's worth of hearings before a meal, so the least-likely-to-succeed cases are typically assigned to spots right before a break.
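To make the ordering argument concrete, here is a rough sketch with no hunger effect in it at all. The approval probabilities and session sizes are made-up numbers, not taken from either paper; the only assumption carried over from the reply is that represented prisoners are heard first within each session.

```python
from typing import List, Sequence, Tuple


def expected_approval_by_position(
    sessions: Sequence[Tuple[int, int]],
    p_counsel: float,
    p_self: float,
) -> List[float]:
    """Expected approval rate at each ordinal position within a session.

    Each session is (n_with_counsel, n_self_represented), heard in that order.
    Approval depends ONLY on representation, never on position or time.
    """
    longest = max(a + b for a, b in sessions)
    rates = []
    for pos in range(longest):
        # Only sessions long enough to reach this position contribute.
        probs = [
            p_counsel if pos < a else p_self
            for a, b in sessions
            if pos < a + b
        ]
        rates.append(sum(probs) / len(probs))
    return rates


if __name__ == "__main__":
    # Hypothetical sessions and probabilities, for illustration only.
    sessions = [(1, 3), (2, 2), (3, 1)]
    for pos, rate in enumerate(expected_approval_by_position(sessions, 0.6, 0.2), 1):
        print(f"position {pos}: expected approval rate {rate:.2f}")
```

The approval rate falls steadily toward the end of each session even though nothing in the model knows about meals, which is the confound the reply describes.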

There are some other bits of weirdness in the original data too. They found a statistically significant association between the ordinal position (e.g., 1st, 2nd, ..., last) and the parole board's decision, but failed to find any effect of actual time elapsed (e.g., in minutes), even though the latter is much more compatible with a physiological hypothesis like running out of glucose.

Interesting, I was unaware. I need to attach more uncertainty to my beliefs about how terrible humans are at making decisions.
As you note, the algorithm for judicial discretion is unknown. The algorithm for this software is fully known, just kept from the public.
The validity of the algorithm can be - and apparently has been - reliably tested and been found to be useful and mostly unbiased. This analysis has been performed by both the algorithm's creators and highly adversarial third parties, such as the author of this article. Both found that whatever bias there is is small, and cannot be distinguished from random chance.

For example, the author of this very article has done such an analysis. Here's her R notebook:

https://github.com/propublica/compas-analysis/blob/master/Co...

Her analysis shows (within the limitations of the frequentist paradigm) that:

a) the predictor is useful - score_factorHigh and score_factorMedium both have p-values that are essentially zero.

b) The predictor is not racially biased that much - race_factorAfrican-American:score_factorHigh and the other bias terms have p-values that are > 0.05.

Look, I'd love it if we required such algorithms to be open source. I'm a huge proponent of both open science and open government. Nevertheless, there is an entire discipline devoted to evaluating predictive algorithms without needing to care about their details - it's called "machine learning".

The wonderful thing about statistics is that even a highly biased person (such as the author of this article) can still reach a correct conclusion that goes against their biases.

Let's be clear: if the null hypothesis in this case is true (that there is no bias), and all other assumptions made are true, there is a slightly greater than 5.7% chance of obtaining this result (or something even more skewed). That's a great bar for publication of SCIENCE. It's not a great bar for hiding behind a proprietary algorithm used in sentencing.

People talk about misuse of p-values, but this takes the cake.
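For anyone who wants to see where a number like 5.7% comes from, here is a minimal sketch. It assumes the coefficient is tested with a two-sided z-test under a normal approximation, which is an assumption on my part about the analysis, and the z values passed in are just illustrative inputs.

```python
import math


def two_sided_p_value(z: float) -> float:
    """P(|Z| >= |z|) for a standard normal Z.

    This is the chance, if the null hypothesis is true, of a result at least
    as extreme as the one observed. Uses 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Illustrative z values only; 1.96 is the usual 5% cutoff.
    for z in (1.0, 1.90, 1.96, 2.58):
        print(f"z = {z:.2f} -> two-sided p = {two_sided_p_value(z):.4f}")
```

A z statistic a little above 1.9 lands just above the 0.05 line, so the gap between "significant" and "not significant" here is a very small difference in the estimate.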

This is in my professional area, and yummyfajitas is right on certain points. The reason these approaches started taking off at all is that the alternative, subjective decisions, doesn't generally work as well. There are plenty of meta-analyses to show this; that's why these risk systems get used.

Also, this analysis is certainly a useful addition to the literature on this system, but it's one analysis, and regardless of your philosophical stance on p-values, a p-value of .057 in the presence of multiple testing isn't the most convincing thing.
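A quick sketch of why multiple testing matters here. The number of tests `m` is hypothetical (I am not claiming how many comparisons the analysis actually ran), and the family-wise formula assumes the tests are independent.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests,
    each run at significance level alpha, when every null is true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_adjusted(p: float, m: int) -> float:
    """Bonferroni-adjusted p-value: multiply by the number of tests, cap at 1."""
    return min(1.0, p * m)


if __name__ == "__main__":
    # m is a made-up number of comparisons, for illustration only.
    for m in (1, 3, 5, 10):
        fwer = familywise_error_rate(0.05, m)
        adj = bonferroni_adjusted(0.057, m)
        print(f"m = {m:2d}: family-wise error rate {fwer:.3f}, adjusted p {adj:.3f}")
```

Bonferroni is the bluntest correction available and tends to be conservative, but it shows the direction: the more terms you look at, the less a single borderline p-value tells you.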

Having said that, the use of non-open predictive systems is a problem for criminal settings. Maybe this thing is biased, but the only way to find out and fix it is to do these sorts of analyses and have this sort of discussion.

The problem isn't the use of prediction systems, it's the use of them without open academic scrutiny, without correcting any biases that emerge.

"but it's one analysis, and regardless of your philosophical stance on p-values, a p-value of .057 in the presence of multiple testing isn't the most convincing thing."

I agree in general. But when you have one data point and it relates to bias in a system, a p-value of .057 suggesting there is bias is more compelling than the null hypothesis. Especially when other independent a priori evidence also seems to point against the null hypothesis.

What if it came out of a neural net or some other system that can't be easily explained? There's no real "specific calculation" to show.

Now if they were using decision trees, e.g. if the person has 3 or more felonies they get a 5 rating, that could be presented.
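Something like this is what I mean by a rule that could be presented. Only the "3 or more felonies gets a 5" rule comes from my example; the other thresholds and ratings are invented to make the sketch complete, and none of it is Northpointe's actual formula.

```python
from typing import Tuple


def risk_rating(prior_felonies: int) -> Tuple[int, str]:
    """Return (rating, rule that fired) so the decision can be explained."""
    if prior_felonies < 0:
        raise ValueError("prior_felonies cannot be negative")
    if prior_felonies >= 3:
        return 5, "3 or more prior felonies"
    if prior_felonies >= 1:
        # Hypothetical middle tier, not from any real scoring system.
        return 3, "1 or 2 prior felonies"
    # Hypothetical lowest tier.
    return 1, "no prior felonies"


if __name__ == "__main__":
    for n in (0, 2, 4):
        rating, reason = risk_rating(n)
        print(f"{n} prior felonies -> rating {rating} ({reason})")
```

The point is that every output comes with the exact rule that produced it, which a defendant or a judge could read and contest.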

I'm curious about how much of a feedback loop this process has. The model was probably trained on old data and never updated. Also, how does it take into account features that it doesn't know about (the article mentions one guy turning to Christianity)? I doubt there is a mechanism for people to be asked why they did or did not reoffend. Even if there were, how much should the answers be trusted?

Neural nets may be opaque but they are not secret.
I have this very concern about using SVM in medical research.

I also worry greatly about diagnostic predictive models that maximise overall prediction success but don't balance the relative consequences of false positives and false negatives.
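Here is a toy sketch of that worry. The scores, labels, candidate thresholds and costs are all invented; the idea is just that two cutoffs can have identical accuracy while making opposite kinds of mistakes, so the choice between them depends on what each error costs.

```python
from typing import Sequence, Tuple


def error_counts(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int]:
    """Return (false positives, false negatives) when predicting 1 for score >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


def total_cost(scores, labels, threshold, cost_fp: float, cost_fn: float) -> float:
    """Total cost of the errors at this threshold, weighting each error type."""
    fp, fn = error_counts(scores, labels, threshold)
    return fp * cost_fp + fn * cost_fn


def cheapest_threshold(scores, labels, candidates, cost_fp: float, cost_fn: float) -> float:
    """Candidate threshold with the lowest total cost (first one wins on ties)."""
    return min(candidates, key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))


if __name__ == "__main__":
    # Invented toy data: model scores and true outcomes (1 = condition present).
    scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 0, 1, 0, 1, 1, 1]
    candidates = [0.25, 0.35, 0.5, 0.65, 0.75]
    for t in candidates:
        fp, fn = error_counts(scores, labels, t)
        print(f"threshold {t}: {fp} false positives, {fn} false negatives")
    print("missed cases 5x worse:", cheapest_threshold(scores, labels, candidates, 1, 5))
    print("false alarms 5x worse:", cheapest_threshold(scores, labels, candidates, 5, 1))
```

On this toy data the 0.35 and 0.65 cutoffs each make exactly one error, so plain accuracy cannot tell them apart, yet one produces a false alarm and the other a missed case.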

I'm leaning towards a Constitutional Amendment against automated law. The wording escapes me (and I'm unqualified anyhow) but the gist would be that only humans can judge humans, no machinery can be allowed to do it.

https://en.wikipedia.org/wiki/Butlerian_Jihad