by unlinked_dll 2382 days ago
* housing, employment, and credit Ads

I think the easiest solution would be to disallow ads of those categories on their platform. I'd think the risk of "facebook/instagram is racist" damaging their brand and the cost of federal discrimination lawsuits would outweigh whatever revenue they project.

As an aside, I know it's a faux pas to bring up any observed (and/or presumed) differences between the protected classes - but maybe (just maybe) Facebook's targeting is smart enough to correlate "most likely to care" about things that tend to have skewed demographics without looking at the demographic data itself. Take the example of truck driver ads targeting men: what is Facebook using to determine who they target? And do those data points line up with demographics?

I don't know, but these kinds of systems are tough to introspect from the outside.


Your aside is pretty much dead-on the big ethical issue with bias in ML right now.

For example, ML can do quite a good job of predicting recidivism rates in convicts, and justice systems have been using this to aid in sentencing and parole hearings. Obviously, these ML approaches are not supposed to consider ethnicity. So the factor that ends up having the greatest weight is "did your father / grandfather spend time in prison", which is an extremely effective proxy for "are you not white".

Basically, when your training data is based on a reality already heavily influenced by bias, your models will end up reflecting and perpetuating that bias.
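A toy sketch of the proxy effect described above, for anyone who wants to see it in numbers. Everything here is made up for illustration: the groups "A" and "B", the `family_incarcerated` flag, and all counts are synthetic assumptions, not real recidivism data. The "model" is just the observed outcome rate per flag value, and it never sees the group column.

```python
from collections import defaultdict

# Synthetic rows: (group, family_incarcerated, reoffended, count).
# All numbers are invented for illustration only.
ROWS = [
    ("A", 1, 1, 100), ("A", 1, 0, 100),
    ("A", 0, 1, 160), ("A", 0, 0, 640),
    ("B", 1, 1, 300), ("B", 1, 0, 300),
    ("B", 0, 1, 80),  ("B", 0, 0, 320),
]


def fit_rates(rows):
    """'Train' on the proxy only: outcome rate per flag value, group ignored."""
    positives = defaultdict(int)
    totals = defaultdict(int)
    for _group, proxy, outcome, count in rows:
        totals[proxy] += count
        positives[proxy] += outcome * count
    return {proxy: positives[proxy] / totals[proxy] for proxy in totals}


def mean_score_by_group(rows, rates):
    """Average predicted risk per group, using only the proxy-based rates."""
    score_sum = defaultdict(float)
    totals = defaultdict(int)
    for group, proxy, _outcome, count in rows:
        score_sum[group] += rates[proxy] * count
        totals[group] += count
    return {group: score_sum[group] / totals[group] for group in totals}


rates = fit_rates(ROWS)
scores = mean_score_by_group(ROWS, rates)
print(rates)
print(scores)
```

In this synthetic data the flag is set for 20% of group A and 60% of group B, so the average predicted risk comes out at roughly 0.26 for A and 0.38 for B even though group membership was never an input.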

The real problem is that there is an actual racial disparity in recidivism rates, so an algorithm that makes accurate predictions will predict the racial disparity that actually exists. There is no way to solve that without significantly impairing the accuracy of the predictions -- which is to say releasing convicts who we know have an unreasonably high probability of recidivism merely because there were too many other convicts with an unreasonably high probability of recidivism who were the same race.

You can also imagine what happens if you apply this recidivism "adjustment" to gender, which causes a lot of the people advocating it in the case of race to become nervous and defensive.

Accuracy is not the top objective in these systems; fairness is.
In this example, what is fairness, if not the most accurate prediction possible?
Especially when you consider fairness to the community at large. Is it fair to black neighborhoods if we send proportionally more expected recidivist drug dealers and rapists back into their communities than we do to white communities?
Fairness is judging a case based on its merits, rather than correlations between other dimensions that are connected with systematic bias.

I should be judged based on the interpretation of my situation, not because someone who lives in a similar neighborhood was previously a bad bet.

Just to start with:

1. Not punishing someone for the sins of their family.

2. Not punishing someone for the unfair treatment that their family suffered in the past.

The effect of its use on policy.
That is incredibly vague. What effect and what policy?
most of the standard metrics of fairness for machine learning don't just try to equalize proportions of positive/negative labels. they look at error rates.

under these measures of fairness, a perfectly accurate predictor is regarded as perfectly fair, regardless of a disparity in base rates in the two populations.

some of the predictive policing models still fail under these metrics -- they are more prone to make errors on black defendants.
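A minimal sketch of what "look at error rates" can mean in practice, assuming the metric is a per-group comparison of false positive and false negative rates. The labels and predictions below are hypothetical, and `error_rates` is an illustrative helper, not any particular library's API.

```python
def error_rates(y_true, y_pred):
    """Return false positive rate and false negative rate for one group."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    return {
        "fpr": fp / negatives if negatives else 0.0,
        "fnr": fn / positives if positives else 0.0,
    }


# Hypothetical outcomes: base rate is 10% in group A and 40% in group B.
truth_a = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
truth_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# A perfect predictor has zero error in both groups despite the base-rate gap.
print(error_rates(truth_a, truth_a), error_rates(truth_b, truth_b))

# An imperfect predictor that errs more often on group B (made-up predictions).
pred_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred_b = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
print(error_rates(truth_a, pred_a), error_rates(truth_b, pred_b))
```

With the made-up predictions, group A has a false positive rate of 1/9 and group B of 2/6, which is the kind of gap these metrics flag.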

> under these measures of fairness, a perfectly accurate predictor is regarded as perfectly fair, regardless of a disparity in base rates in the two populations.

Unless your predictor is perfectly accurate, the errors will be proportional to the base rate. If you're predicting that more X will do Y then you have more chances to be wrong.

Improving accuracy is the only real way to reduce the error rate. If you can't do that then you're left with malicious nonsense like fudging the base rate, which is just trading false positives for false negatives and not actually making anything better.
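A small sketch of the false positive / false negative trade mentioned above. It assumes "fudging" amounts to moving a decision threshold on a fixed risk score, which is one reading of the comment rather than something it spells out. The scores and labels are invented.

```python
def count_errors(scores, labels, threshold):
    """Count (false positives, false negatives) when flagging score >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


# Hypothetical risk scores with their true outcomes (1 = reoffended).
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1, 1]

# The model (and so its ranking) is unchanged; only the cut-off moves.
for threshold in (0.25, 0.5, 0.75):
    fp, fn = count_errors(scores, labels, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this data the three thresholds give (2, 0), (1, 1), and (0, 3) false positives and false negatives respectively: lowering one count raises the other, and only a better score ordering would reduce both.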

That "smart"ness you describe is almost definitely what the algorithm's doing. And this is all well and good, until you take into consideration the fact that these systems tend to amplify existing biases.

If it spots a correlation, it'll amplify the correlation, regardless of whether it's actually meaningful. There are probably some originally spurious correlations that Facebook has amplified into existence, given how big and all-encompassing it is. These are the same problems that lead to racist AI judges.
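A toy model of that amplification, under loud assumptions: two hypothetical audiences "A" and "B" with the same true click rate, a one-click head start for A that is pure chance, and a greedy allocator that sends each round's whole budget to whichever audience has the higher observed click rate. This is not how any real ad system is known to work; it only shows how a feedback loop can lock in a spurious gap.

```python
def simulate(rounds, true_rate=0.10, budget=100):
    """Greedy ad allocation; returns impressions shown to each audience."""
    shown = {"A": 50, "B": 50}
    # Both audiences truly click at 10%; A got one extra click by luck.
    clicks = {"A": 6.0, "B": 5.0}
    for _ in range(rounds):
        # Pick the audience with the best observed click-through rate so far.
        best = max(shown, key=lambda g: clicks[g] / shown[g])
        shown[best] += budget
        # Use expected clicks so the sketch stays deterministic.
        clicks[best] += budget * true_rate
    return shown


shown = simulate(10)
share_a = shown["A"] / (shown["A"] + shown["B"])
print(shown, round(share_a, 3))
```

After ten rounds audience A has received 1050 of 1100 impressions, even though both audiences were equally interested. Audience B's estimate never gets a chance to correct itself because it is never shown the ad again.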

I suspect it’s going to become a massive shakedown racket over the next decade; groups will go to tech companies and allege their algorithms are racist/sexist/whatever, and keep complaining until paid to go away.