Hacker News
by YeGoblynQueenne 2670 days ago
So, this is the data that the wikipedia page on Simpson's Paradox cites for the Berkeley study, and that the author of the article has quoted:

                     Men              Women
    Department Applied  Admitted Applied  Admitted
    A          [825]    62%      108      [82%]
    B          [560]    63%      25       [68%]
    C          325      [37%]    [593]    34%
    D          [417]    33%      375      [35%]
    E          191      [28%]    [393]    24%
    F          [373]    6%       341      [7%]

Above, I've bracketed, for each department, a) the sex with the most applicants and b) the sex with the higher admission rate. If that data is really the Berkeley data, then it's clear that the bias is against the sex with the most applicants, rather than against either men or women.

I can propose a mechanism for this kind of (with some abuse of terminology) selection bias. A department accepts some applications, then realises it has admitted too many applicants of one sex and starts rejecting applicants from the dominant sex in an attempt to redress the balance. It makes a mess of it and ends up biased too far in the opposite direction from where it started.

Also note that in 4 out of 6 departments, more men applied than women, explaining why more departments appear biased against men (provided my observation holds).
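The bracketing can be checked mechanically. This is only a sketch: the rows are copied from the table above, the function name `compare` is made up for illustration, and the admission rates are the rounded percentages as quoted, not the original counts.

```python
# Rows copied from the table above:
# (department, men applied, men admitted %, women applied, women admitted %)
ROWS = [
    ("A", 825, 62, 108, 82),
    ("B", 560, 63, 25, 68),
    ("C", 325, 37, 593, 34),
    ("D", 417, 33, 375, 35),
    ("E", 191, 28, 393, 24),
    ("F", 373, 6, 341, 7),
]


def compare(rows):
    """Map each department to (sex with more applicants, sex with higher rate)."""
    result = {}
    for dept, m_app, m_rate, w_app, w_rate in rows:
        more_applicants = "men" if m_app > w_app else "women"
        higher_rate = "men" if m_rate > w_rate else "women"
        result[dept] = (more_applicants, higher_rate)
    return result


summary = compare(ROWS)
for dept, (more_applicants, higher_rate) in summary.items():
    print(f"{dept}: more applicants = {more_applicants}, higher rate = {higher_rate}")

# Departments where the sex with fewer applicants has the higher admission rate.
opposite = sum(1 for applicants, rate in summary.values() if applicants != rate)
# Departments where men are the larger applicant group.
men_majority = sum(1 for applicants, _ in summary.values() if applicants == "men")
print(f"{opposite} of {len(summary)} departments favour the smaller applicant group")
print(f"{men_majority} of {len(summary)} departments had more male applicants")
```

There are no ties in this table, so the strict comparisons are enough here; a more careful version would handle equal counts or equal rates explicitly.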

However, I can't be sure whether this is actually the original data because it's nowhere to be found in my PDF copy of the study (Sex bias in graduate admission) which I believe I got from here: https://homepage.stat.uiowa.edu/~mbognar/1030/Bickel-Berkele.... If anyone knows where this data actually comes from, I'd welcome a pointer.


As a separate comment, which might be controversial, I would like to call bullshit on the entire claim of the Berkeley study in particular (and not about Simpson's Paradox in general). In the "Berkeley data" (if that's what it is), it's clear again that men applied to most departments in larger numbers than women. The Berkeley data claims that because more women were admitted on a per-department basis, more departments were biased against men.

Now, picture this. Alice and Bob share a pizza. Alice takes 7 pieces and Bob takes 3 (he's on an intermittent fasting diet so he only eats every other slice). Alice eats 4 of her slices, Bob eats 3 of his. At the end, Alice turns to Bob and says "boy, you're such a glutton! You scoffed down all of your slices, but I still have 3 left".

Is that a fair comparison? Well, no. Alice starts out with more than double the slices Bob has. Bob eats less than Alice, but he's accused of stuffing his face because he eats a larger proportion of his smaller share.

Same with the Berkeley data. If that is the Berkeley data.

I'm not quite sure I follow your complaint, but I think I might be disagreeing with you. A key lesson of Simpson's Paradox is you can't read stories into data without having a causal model derived from outside the data.
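To make the paradox concrete, here is a rough sketch that pools the six departments from the table at the top of the thread. The admitted counts are not given there, so they are reconstructed from the rounded percentages (an assumption), and the helper name `pooled_rate` is made up for illustration. Treat the pooled figures as approximate.

```python
# Rows copied from the table at the top of the thread:
# (department, men applied, men admitted %, women applied, women admitted %)
ROWS = [
    ("A", 825, 62, 108, 82),
    ("B", 560, 63, 25, 68),
    ("C", 325, 37, 593, 34),
    ("D", 417, 33, 375, 35),
    ("E", 191, 28, 393, 24),
    ("F", 373, 6, 341, 7),
]


def pooled_rate(groups):
    """Overall admission rate for a list of (applied, admitted %) pairs.

    Admitted counts are estimated as applied * pct / 100, because only
    rounded percentages are available.
    """
    applied = sum(n for n, _ in groups)
    admitted = sum(n * pct / 100 for n, pct in groups)
    return admitted / applied


men_pooled = pooled_rate([(m_app, m_rate) for _, m_app, m_rate, _, _ in ROWS])
women_pooled = pooled_rate([(w_app, w_rate) for _, _, _, w_app, w_rate in ROWS])

# Number of departments in which women have the higher admission rate.
women_ahead = sum(1 for _, _, m_rate, _, w_rate in ROWS if w_rate > m_rate)

print(f"pooled rate, men:   {men_pooled:.1%}")
print(f"pooled rate, women: {women_pooled:.1%}")
print(f"departments where women have the higher rate: {women_ahead} of {len(ROWS)}")
```

On these numbers the pooled rate comes out clearly higher for men (roughly 45% against 30%) even though women have the higher rate in most departments, which is the reversal the paradox is named for. The code only shows the arithmetic; it says nothing about which causal story is right.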

I can comfortably invent stories that are not inconsistent with the data for a wide range of scenarios:

1) Only the most capable women are applying to Dept A due to discrimination, so the data is evidence of discrimination.

2) Dept A is discriminating in favour of women (self-evident, roughly 80% vs 60% admissions).

3) Dept A is completely non-discriminatory and the assessors are unaware of the gender of applicants; the differences are due to personal choices w.r.t. education and social networks turning out to be proxies for gender.

No study of this sort of data can detect gender bias. It can be used as evidence in a broader study that comes up with a causal model for how the admissions process works, but there is no getting around interviews and field observations.

I'm not challenging Simpson's paradox, only the conclusion quoted with respect to the data in the above table (I'm still not sure where it came from).
You need to look at the figures. The differences that support your argument are minor and within the margin of error. You could similarly conclude that women are just smarter across the board.
I'm sorry, I don't understand your comment. What difference is minor? What is the margin for error? And how would I conclude what you say?
Men are only favourites by 1-2%. That's within the margin of error. Women are favourites by, say, 10% plus. The comment treats them the same, and bases its theory on a binary concept. It's just bad logic and may even be a version of Simpson's paradox.
Women are the favorite by 10%+ only for a single department. This is a _different_ fallacy, now...
I still do not understand. How are men "favourites by 1%-2%" and women "by 10% plus"? Favourites, for what?

And how did you calculate the margin of error for this study?

First, for each subject you compare the chance of admission. Where men have the higher chance of admission, even in their most advantaged subject the gap is only 4%. Women, on the other hand, have a gap of 20%. You can't say that they are equivalent in the least. In terms of error margins, a few percent is common, from experience. You could do a stats 95 confidence style calculation.
You're talking about the difference between the percentages of applicants of each sex that were admitted. I tabulate:

                  Men              Women              % Difference
    Department Applied  Admitted Applied  Admitted    Men     Women
    A          [825]    62%      108      [82%]               +20%
    B          [560]    63%      25       [68%]               +5%
    C          325      [37%]    [593]    34%         +3%
    D          [417]    33%      375      [35%]               +2%
    E          191      [28%]    [393]    24%         +4%
    F          [373]    6%       341      [7%]                +1%

So, there's a 20-point difference for one department that is a clear outlier, and everything else is within a few percentage points. In fact, ignoring the outlier, the average difference is higher for men (3.5) than for women (2.67).
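The averages can be reproduced with a few lines. A sketch only: the rates are the rounded percentages from the table, and treating department A as the outlier is the choice made in this comment, not a statistical test.

```python
# (department, men admitted %, women admitted %), copied from the table above.
RATES = [
    ("A", 62, 82),
    ("B", 63, 68),
    ("C", 37, 34),
    ("D", 33, 35),
    ("E", 28, 24),
    ("F", 6, 7),
]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Gap in percentage points; positive means women have the higher rate.
gaps = {dept: w_rate - m_rate for dept, m_rate, w_rate in RATES}

men_gaps = [-g for g in gaps.values() if g < 0]
women_gaps = [g for g in gaps.values() if g > 0]
women_gaps_no_a = [g for dept, g in gaps.items() if g > 0 and dept != "A"]

men_mean = mean(men_gaps)
women_mean = mean(women_gaps)
women_mean_no_a = mean(women_gaps_no_a)

print(f"mean gap where men lead:             {men_mean:.2f}")
print(f"mean gap where women lead:           {women_mean:.2f}")
print(f"mean gap where women lead, minus A:  {women_mean_no_a:.2f}")
```

Note how much the women's average depends on whether department A is kept: with it, the figure is several points higher than the men's; without it, it drops below.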

However, I'm really not sure that taking the difference between proportions of different wholes is meaningful. The numbers don't add up to 100, so what does the difference mean, exactly?

I don't know what "a stats 95 confidence style" is, or how it is related to a margin of error, so please do that calculation and post your results.
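For what it's worth, one common reading of "95 confidence style calculation" is a 95% confidence interval for the difference between two proportions, using the normal approximation. The sketch below assumes that reading, which the thread itself does not confirm. It also assumes the rounded percentages in the table are close enough to the true rates, and the function name `diff_ci` is made up for illustration. The approximation is rough for small groups such as the 25 women who applied to department B.

```python
import math

# Rows copied from the table above:
# (department, men applied, men admitted %, women applied, women admitted %)
ROWS = [
    ("A", 825, 62, 108, 82),
    ("B", 560, 63, 25, 68),
    ("C", 325, 37, 593, 34),
    ("D", 417, 33, 375, 35),
    ("E", 191, 28, 393, 24),
    ("F", 373, 6, 341, 7),
]


def diff_ci(n_men, pct_men, n_women, pct_women, z=1.96):
    """Normal-approximation interval for (women's rate - men's rate).

    z=1.96 corresponds to a 95% interval. Returns (low, high) as fractions.
    """
    p_men = pct_men / 100
    p_women = pct_women / 100
    diff = p_women - p_men
    se = math.sqrt(p_men * (1 - p_men) / n_men + p_women * (1 - p_women) / n_women)
    return diff - z * se, diff + z * se


intervals = {dept: diff_ci(m, mr, w, wr) for dept, m, mr, w, wr in ROWS}

# Departments whose interval does not contain zero.
excludes_zero = [dept for dept, (low, high) in intervals.items() if low > 0 or high < 0]

for dept, (low, high) in intervals.items():
    print(f"{dept}: {low:+.3f} to {high:+.3f}")
print("intervals excluding zero:", excludes_zero)
```

Under these assumptions only department A's interval stays clear of zero. An interval like this says how noisy the per-department gap is; it does not say anything about why the gap exists.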