by anton_tarasenko 2671 days ago
"Algorithmic justice" reminded me of a study where researchers predicted the risk of a crime better than judges:[1]

> Millions of times each year, judges must decide where defendants will await trial—at home or in jail. By law, this decision hinges on the judge’s prediction of what the defendant would do if released. This is a promising machine learning application because it is a concrete prediction task for which there is a large volume of data available. Yet comparing the algorithm to the judge proves complicated. First, the data are themselves generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the single variable that the algorithm focuses on; for instance, judges may care about racial inequities or about specific crimes (such as violent crimes) rather than just overall crime risk. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: a policy simulation shows crime can be reduced by up to 24.8% with no change in jailing rates, or jail populations can be reduced by 42.0% with no increase in crime rates.

[1] https://www.cs.cornell.edu/home/kleinber/w23180.pdf
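The "we only observe crime outcomes for released defendants" problem from the abstract can be sketched with toy numbers. Everything below is invented for illustration (the `defendants` list and both function names are hypothetical, not from the paper); it only shows why a rate computed from released defendants can differ from the rate over everyone.

```python
# Each tuple: (released_by_judge, would_reoffend_if_released).
# All values are made up; in real data the second field is unknown for detained people.
defendants = [
    (True, False), (True, False), (True, False), (True, True),
    (False, True), (False, True), (False, False), (False, True),
]

def observed_reoffense_rate(rows):
    """Rate computable from real records: only released defendants have outcomes."""
    outcomes = [reoffend for released, reoffend in rows if released]
    return sum(outcomes) / len(outcomes)

def true_reoffense_rate(rows):
    """Rate over everyone; not observable in practice, since detained outcomes are counterfactual."""
    return sum(reoffend for _, reoffend in rows) / len(rows)

print(observed_reoffense_rate(defendants), true_reoffense_rate(defendants))
```

In this toy setup the judge released mostly low-risk people, so the observed rate understates what would happen if everyone were released. That gap is the reason a naive comparison of algorithm vs. judge on released cases alone is not enough.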

3 comments

The authors note that judges may care explicitly about racial bias, but based on a quick read they're making a really, really big mistake in the language they're using: they confuse arrests with crime. Arrests and convictions are simply a measurement mechanism for crime, and that mechanism is known to have severe biases.
Kleinberg uses arrests for violent crime because they are known to have substantially less bias (or none).
There is no convincing evidence that arrest rates "severely" overestimate offense rates; if anything it is just as likely arrest rates underestimate offense rates.
While what you say specifically is true, using arrest rates to determine criminal activity by race (and the subsequent conviction rate, etc.) has been shown to have a strong relationship to race, at least in the United States. There are entire books on the subject. You can't tie racial arrest rates to the underlying crime rate, as POC get arrested far more often for the same crimes.
Indeed. In a closed society where everyone is guilty the only crime is getting caught.

This concept is the core basis of the war on drugs.

I'm not sure I understand that correctly. Are you saying that POC are acquitted or have charges dropped far more often for the same crimes (i.e. have a far lower conviction rate)?
POC are more likely to be pulled over, then when pulled over more likely to be asked to be searched, then when searched more likely to be arrested when situations are similar to non-POC folks.

It doesn't really stop there either, they are more likely to be convicted of the same crimes and then get longer sentences. They are less likely to be offered probation. This eliminates huge percentages of men permanently from POC communities. It is possible that this process can be blamed for the social issues present in the inner city.

"The New Jim Crow" covers a lot more in a lot more detail. I would strongly recommend the read.

In general, POC are more likely to be arrested for committing a crime.

The parent's point isn't about whether they are acquitted, it's that if you were to commit a crime as a POC, you are more likely to be arrested than if you had committed that same crime as a non-POC. In both scenarios you committed a crime, but in one of them the system never has a record of it. This is why arrest rates and crime rates are different: if a POC is more likely to get arrested for committing a crime, the arrest rates by race (POCs get arrested more) will not reflect the crime rates by race (differences are generally smaller).
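A quick arithmetic sketch of that argument, with made-up numbers (the group names, offense rates, and arrest probabilities are purely illustrative assumptions, not data from any study):

```python
def observed_arrest_rate(offense_rate, p_arrest_given_offense):
    """Arrests per person = offenses per person * chance an offense leads to arrest."""
    return offense_rate * p_arrest_given_offense

# Two hypothetical groups with identical underlying offense rates
# but different chances of being arrested for the same offense.
groups = {
    "group_a": {"offense_rate": 0.10, "p_arrest": 0.30},
    "group_b": {"offense_rate": 0.10, "p_arrest": 0.15},
}

arrest_rates = {
    name: observed_arrest_rate(g["offense_rate"], g["p_arrest"])
    for name, g in groups.items()
}

arrest_ratio = arrest_rates["group_a"] / arrest_rates["group_b"]
offense_ratio = groups["group_a"]["offense_rate"] / groups["group_b"]["offense_rate"]

print(f"arrest-rate ratio:  {arrest_ratio:.2f}")
print(f"offense-rate ratio: {offense_ratio:.2f}")
```

With these assumed inputs the arrest records differ by a factor of about two while the underlying offense rates are identical, so anything trained on arrests inherits the difference in arrest probability.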

Most empirical data indicates that white people are arrested and convicted at a higher rate relative to basal offense rate than black people.
Communities of color are over-policed. There's a huge racial divide in income, and crimes that tend to be committed by the poor (like shoplifting, loitering, and fare-evasion) are far more likely to be prosecuted than crimes committed by the wealthy (smoking some weed in your suburban living room, fudging your taxes a bit). The end result is that people of color are more likely to have an arrest record, even if they're just as likely to commit a crime as a white person.
For an example of this backed up by data, the NYPD stop-and-frisk program has always overwhelmingly focused on black and Latino people (almost 90% of all people stopped in some years), even though they make up only 15% of the population of some of the precincts involved AND white people were more individually likely to actually have an illegal weapon (the supposed reason for the stop-and-frisk program to exist).

https://www.nyclu.org/en/stop-and-frisk-data https://www.nyclu.org/en/press-releases/analysis-finds-racia...

Just to add a bit, you can look at the justice system as a binary classifier if you squint hard enough. So it has both false positives and false negatives, both of which are difficult to actually measure.

On the one hand, you have POC arrested and convicted of crimes that wouldn't be charged in other parts of town (arguably false negatives, amongst the non-POC). On the other, prosecutors use the plea bargain system to get people to plead out for smaller charges instead of risking decades of their lives at trial; this is an excellent way to produce false positives.
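To make the binary-classifier framing concrete, here is a minimal sketch on toy data. The `committed` and `convicted` lists are invented, and the function names are hypothetical; the catch the comment points at is that in reality `committed` is exactly the column nobody gets to observe.

```python
from collections import Counter

def confusion_counts(committed, convicted):
    """Tally true/false positives/negatives, treating a conviction as a positive."""
    counts = Counter()
    for truth, verdict in zip(committed, convicted):
        if truth and verdict:
            counts["tp"] += 1
        elif truth and not verdict:
            counts["fn"] += 1  # offense happened, system has no record
        elif not truth and verdict:
            counts["fp"] += 1  # e.g. an innocent person pleading out
        else:
            counts["tn"] += 1
    return counts

def error_rates(counts):
    """Return (false positive rate, false negative rate)."""
    fp_rate = counts["fp"] / (counts["fp"] + counts["tn"])
    fn_rate = counts["fn"] / (counts["fn"] + counts["tp"])
    return fp_rate, fn_rate

# Toy ground truth and toy verdicts, made up for illustration.
committed = [True, True, True, False, False, False, False, False]
convicted = [True, False, False, True, False, False, False, False]

counts = confusion_counts(committed, convicted)
fp_rate, fn_rate = error_rates(counts)
print(dict(counts), fp_rate, fn_rate)
```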

Until the populace learns how to improve their chances of getting released and starts to game the system, introducing endogeneity.

(Also noticed the nice coincidence of a professor with user name kleinber having an NBER Working Paper)

Which is totally not possible with judges, right?

I feel like a lot of arguments being made here fail the A vs B test. Any argument that purports to provide help with choosing Judges vs Algorithms needs to apply differently to Judges, and differently to Algorithms.

How about: with Judges we simply won't know (for sure) what influences them. Are they racist? Who knows. Do they prefer to let people with jobs out? (Realistically: yes, but we don't know for sure.) Do they ...

With an algorithm we can literally test, by presenting it with artificial cases, lots of them, and seeing how it judges. With a judge, you can't.
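A rough sketch of what such a test could look like: feed the algorithm pairs of artificial cases that differ in exactly one attribute and count how often the decision changes. `toy_release_decision` is a hypothetical stand-in invented for this example, not any real risk model, and the field names are assumptions.

```python
import random

def toy_release_decision(case):
    """Hypothetical algorithm under audit; returns True for "release"."""
    score = 0.1 * case["prior_arrests"] - 0.3 * case["employed"]
    return score < 0.25

def paired_audit(decide, cases, attribute, values):
    """Fraction of cases whose decision changes when only `attribute` is varied."""
    flipped = 0
    for case in cases:
        decisions = {decide({**case, attribute: v}) for v in values}
        if len(decisions) > 1:
            flipped += 1
    return flipped / len(cases)

# Lots of artificial cases; the seed is fixed so the run is repeatable.
rng = random.Random(0)
cases = [{"prior_arrests": rng.randint(0, 6), "employed": 0} for _ in range(1000)]

sensitivity = paired_audit(toy_release_decision, cases, "employed", [0, 1])
print(f"decision depends on employment in {sensitivity:.0%} of artificial cases")
```

The same harness works for any attribute the model receives. It won't catch proxies (an attribute the model never sees directly but can infer from others), so a clean audit on one field is not proof of no bias.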

> I feel like a lot of arguments being made here fail the A vs B test. Any argument that purports to provide help with choosing Judges vs Algorithms needs to apply differently to Judges, and differently to Algorithms.

Out of curiosity, is there a name for this "fallacy", if it is one? To me it mostly seems like the other party is failing at some basic level of critical thought.

I've been dealing a lot with arguments of this nature at work, and it'd be great to have a name for it. Pointing it out in the verbatim sense ("ok, but that's true of <your counter position> as well") becomes tiring quickly, and honestly, just causes the person to move on to the next fallacious claim.

Well, humans are already capable of dealing with it. The judges know that prisoners know what is expected of a good prisoner. The decisions are already being made with that in mind.

Contrast this with evaluating a programmer's performance. Everyone knows that lines of code written, number of tickets closed, number of fixed bugs or lines of documentation written do correlate well with performance. But the minute they are revealed to impact performance reviews, those metrics become trash. Until you can find viable instruments, you shouldn't ever put those into a model and expect to have good predictions. If your model is not explicitly equipped to deal with endogeneity (like structural equation models), it will fail when faced with it.

If you think a judge is influenced by things that are unrelated to the case, you should appeal to the court above (which you can readily do in Continental Europe, but I don't know about Common Law).

> Well, humans are already capable of dealing with it

If that were true, algorithms wouldn't be able to outperform those humans on the metrics that matter.

About your faking metrics issue, the trick in this case is simply taking the metrics that matter and feeding them into an algorithm. Problem solved.

Take metrics:

1) will the suspect face justice if released

2) will the suspect reintegrate faster if released

Any criminal who wants to game those metrics, well I for one will be applauding that!

> Until the populace learns how to improve their chances of getting released

As long as that correlates with behavior we want to see from the populace; https://xkcd.com/810/

>By law, this decision hinges on the judge’s prediction of what the defendant would do if released.

That isn’t the legal standard for release pending trial.

Thus it may look like an algorithm “predicts” crime better than judges, but judges can’t withhold bail/bond because they “predict” a certain defendant will commit another crime (because that’s not exactly what judges are doing when determining bond).