by eridius
3549 days ago
No, I'm claiming that P(crime detected) != r(pp). More police in an area typically means more crime is detected, but that's not the only factor. If you have two areas with identical police presence and identical actual crime rates (as opposed to reported crime rates), the rate of crime detection (as measured by arrests and whatnot) may be higher in one area due to other factors such as racial bias (not just racial profiling, but also things like police letting a white person off with a warning where the equivalent black person would be arrested). So you cannot simply correct for this by accounting for the police presence. What's more, your data may not even have the necessary info to figure out if there's a bias. For example, what if police are more likely to arrest someone wearing a red shirt than someone wearing any other color shirt? Unless the color of the person's shirt is part of the arrest report, there's no way your statistical model is going to figure out that red shirts affect arrest rate.
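To make the red-shirt point concrete, here is a rough simulation sketch. Everything in it is an assumption made up for illustration: the 10% actual crime rate, the `0.2 * police_presence` detection function standing in for r(pp), the 1.5x red-shirt multiplier, and the names `simulate` and `arrest_rate`. The only point is that an arrest report which records police presence but not shirt color gives an analyst nothing to condition on.

```python
import random

def simulate(n=100_000, seed=0):
    """Generate (police_presence, red_shirt, arrested) tuples. All rates are made up."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        police_presence = rng.choice([1, 2, 3])  # recorded in the report
        red_shirt = rng.random() < 0.3           # NOT recorded in the report
        committed = rng.random() < 0.1           # identical actual crime rate for everyone
        p_detect = 0.2 * police_presence         # assumed stand-in for r(pp)
        if red_shirt:
            p_detect = min(1.0, p_detect * 1.5)  # assumed bias against red shirts
        arrested = committed and rng.random() < p_detect
        records.append((police_presence, red_shirt, arrested))
    return records

def arrest_rate(records, key):
    """Arrest rate per group, where key(record) picks the grouping value."""
    totals = {}
    for rec in records:
        k = key(rec)
        hits, count = totals.get(k, (0, 0))
        totals[k] = (hits + rec[2], count + 1)
    return {k: hits / count for k, (hits, count) in totals.items()}

if __name__ == "__main__":
    records = simulate()
    # What the analyst can compute from the report: rate by police presence.
    print("by police presence:", arrest_rate(records, lambda r: r[0]))
    # What only the simulation can compute: rate by the unrecorded attribute.
    print("by shirt color:    ", arrest_rate(records, lambda r: r[1]))
```

Grouping by police presence reproduces the r(pp) trend, and the shirt effect is simply averaged into it. The second grouping is only available because the simulation kept a column that a real arrest report would not have.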
|
We've now established the existence of a statistical model which can detect this bias.
Now, any other model which is capable of expressing your specific r(p) can do the same thing. The entire purpose of fancy models like random forests is that they can express lots of functions while also being reasonably generalizable.
If you want to claim that this bias is much more difficult to encode in an SVM than all the other typical hidden patterns, you need to establish that your specific r(...) is somehow vastly more complicated than all the other things that machine learning models regularly detect. That's a pretty strong claim.
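As a rough illustration of the "a model that can express it can detect it" claim, here is a sketch using plain logistic regression rather than a random forest or SVM, fitted by gradient descent on synthetic data. The arrest probabilities, the binary `group` attribute, and the function names are all assumptions invented for the example, and it only works because `group` is actually recorded in the data.

```python
import math
import random

def simulate_counts(n=20_000, seed=1):
    """Return {(presence, group): [people, arrests]} for synthetic data."""
    rng = random.Random(seed)
    cells = {(p, g): [0, 0] for p in (0, 1) for g in (0, 1)}
    for _ in range(n):
        presence = rng.randint(0, 1)  # low/high police presence
        group = rng.randint(0, 1)     # recorded attribute that is treated differently
        # Assumed arrest probability: presence and bias both raise it.
        p_arrest = 0.10 + 0.10 * presence + 0.15 * group
        cell = cells[(presence, group)]
        cell[0] += 1
        cell[1] += rng.random() < p_arrest
    return cells

def fit_logistic(cells, steps=5000, lr=0.5):
    """Batch gradient descent on the logistic log-loss, using per-cell counts."""
    total = sum(people for people, _ in cells.values())
    w0 = w_presence = w_group = 0.0
    for _ in range(steps):
        g0 = gp = gg = 0.0
        for (presence, group), (people, arrests) in cells.items():
            z = w0 + w_presence * presence + w_group * group
            predicted = 1.0 / (1.0 + math.exp(-z))
            err = people * predicted - arrests  # summed (prediction - label) in this cell
            g0 += err
            gp += err * presence
            gg += err * group
        w0 -= lr * g0 / total
        w_presence -= lr * gp / total
        w_group -= lr * gg / total
    return w0, w_presence, w_group

if __name__ == "__main__":
    w0, w_presence, w_group = fit_logistic(simulate_counts())
    print("intercept:", w0)
    print("police presence coefficient:", w_presence)
    print("group coefficient:", w_group)
```

A clearly positive `w_group` alongside a positive `w_presence` means the fit attributes part of the arrest rate to group membership even after accounting for police presence. Whether a more flexible model recovers a more complicated r(...) from real data is a separate question that this toy does not answer.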
Interestingly, you are now arguing the exact opposite of what most "machine learning is racist" people claim. They typically claim machine learning is racist because algorithms actually learn hidden factors they wish the algorithms wouldn't; e.g., a lending algorithm might "redline" blacks who don't pay back their debts. I take it you believe this is highly unlikely, and algorithms can't possibly distinguish between men and women and then show high-paying job ads to more men than women?