Hacker News
by jldugger 977 days ago
> Say I have a simple table of outdoor temperatures and ice cream sales. What can the machinery of causal inference do for me in this situation?

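Not much. Causal inference works over networks of variables, specifically a directed acyclic graph (DAG), and two variables alone don't give it much to work with. But in practice you usually know more than one association between variables, so this is more an issue of pedagogy than of the tool itself.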

Probably the shortest, most persuasive example I can give you is the resolution of Simpson's Paradox, where the correlation between two variables can change, or even reverse, depending on whether you condition on a third variable.

The classic example is gender discrimination in college admissions. When looking at admissions rates across the entire university, women are less likely to be accepted than men. But when (in this example) you break that down into departments, every department favors women over men. This looks like a contradiction, and it's worrying because your science is only as good as the dimensions your data captures. Worse, the data alone offers no clean way to say which answer is correct: the aggregate or the per-department breakdown. Statisticians stumbled on this for a long while, and it's kind of wild that we were able to declare that smoking causes cancer without a resolution to it.

Pearl wrote a paper on how causal approaches resolve the paradox[1], though it presumes familiarity with terms like "colliders," "backdoor criterion," and "do-calculus." His main point is that causal inference gives us the language and tools to resolve the paradox, which purely statistical approaches, frequentist ones included, do not.

[1]: https://ftp.cs.ucla.edu/pub/stat_ser/r414.pdf


> When looking at admissions rates across the entire university, women are less likely to be accepted than men. But when (in this example) you break that down into departments, every department favors women over men.

If every department favored women then the entire university would also favor women. Parity is guaranteed in that scenario. What happened in the Berkeley case is that not every department favored women, and women applied disproportionately to the departments with lower admissions rates (including some that didn't favor them), while men did the opposite.

Yes, apologies, what I meant by "favored" was that in every department, women applicants were more likely to be admitted than men. But I'm pretty sure the admission rate can still be lower for women overall than for men overall, using exactly the same scenario you described. If the sociology department admits 10 percent of applicants and the physics department admits 90 percent, it seems very easy for a gender skew in applications to push women towards the 10 percent department and men towards the 90 percent one, so that women come out behind overall even if their rate within each department is a few percent higher.
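
To make the arithmetic concrete, here is a minimal sketch with made-up numbers (the applicant counts below are purely hypothetical, not the Berkeley data, and the helper names are mine). Women get a rate two points higher in each department, but 90 percent of them apply to the low-admission department:

```python
# Hypothetical counts per department and gender: (applicants, admitted).
# These numbers are invented for illustration; they are not real data.
applications = {
    "sociology": {"women": (900, 108), "men": (100, 10)},
    "physics": {"women": (100, 92), "men": (900, 810)},
}


def rate(applicants, admitted):
    """Fraction of applicants who were admitted."""
    return admitted / applicants


def department_rates(data):
    """Admission rate for each gender within each department."""
    return {
        dept: {gender: rate(*counts) for gender, counts in by_gender.items()}
        for dept, by_gender in data.items()
    }


def overall_rates(data):
    """Admission rate for each gender, pooled across all departments."""
    totals = {}
    for by_gender in data.values():
        for gender, (applicants, admitted) in by_gender.items():
            seen_applicants, seen_admitted = totals.get(gender, (0, 0))
            totals[gender] = (seen_applicants + applicants, seen_admitted + admitted)
    return {gender: rate(*counts) for gender, counts in totals.items()}


per_dept = department_rates(applications)
overall = overall_rates(applications)

for dept, rates in per_dept.items():
    print(f"{dept:>10}: women {rates['women']:.0%}, men {rates['men']:.0%}")
print(f"{'overall':>10}: women {overall['women']:.0%}, men {overall['men']:.0%}")
```

With these counts, women are ahead in sociology (12% vs 10%) and in physics (92% vs 90%), yet behind overall (20% vs 82%), because each pooled rate is an average of the department rates weighted by where that group applied.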
I get your point now. You're quite right that you can construct scenarios that arbitrarily favor men in the aggregate but women in specific departments, given the right ratio of applicants.