Hacker News
by Mangalor 4402 days ago
Experiments through data are by no means impossible when it comes to race or gender. That's the lifeblood of social science. Throwing your hands up and saying "too many numbers! no conclusions could ever be possibly found!" would be completely unacceptable in any other discipline. You're now picking and choosing which fields can even use basic statistics.

> Experiments through data are by no means impossible when it comes to race or gender. That's the lifeblood of social science.

It's also why hard science majors make fun of them.

> Throwing your hands up and saying "too many numbers! no conclusions could ever be possibly found!" would be completely unacceptable in any other discipline.

That's because just about any other discipline is capable of conducting a controlled experiment. The problem with statistics in social sciences is that you don't control anything. You can't just order families of a particular race to stop having more or fewer children than other races so that you can get a good control group, so you have no control group. You only have data from something you hope is a reasonable approximation of a control group, without even any good way to tell when it isn't.

> It's also why hard science majors make fun of them.

Hard science recognizes social science work when solid data is used and the methodology is well understood and effective. You're generalizing.

> That's because just about any other discipline is capable of conducting a controlled experiment

> You can't just order families of a particular race to stop having more or fewer children than other races so that you can get a good control group, so you have no control group.

You look at families of a race that had fewer children and compare them to families of the same race with more children. That would be a data experiment controlled for race. Read Freakonomics if you want to understand data experiments better.

> You look at families of a race that had fewer children and compare them to families of the same race with more children. That would be a data experiment controlled for race.

That's exactly how you compound the problem and get the wrong answer. How do you know that the factors causing parents to have more or fewer children are the same between races, or that those factors don't directly impact parenting ability? Suppose the predominant factor in low-income Asian Americans having three or more children is a calculated decision that the couple's extended family has enough resources to responsibly raise that number of children (i.e. rich uncle), but the predominant factor in low-income African Americans having three or more children is accidental pregnancy.

At first you had to take into account income level, but to do that you have to factor out fertility rate, and to factor that out you have to account for the different causes behind the differing fertility rates. If we then discover that the predominant cause of accidental pregnancy in African Americans is religious opposition to birth control or abortion, don't we then have to account for the causes and consequences of a higher degree of faith in religion?

Nobody has the resources to go all the way down the rabbit hole. But everywhere you look there is some factor that would change the outcome by 50% in one direction or the other if you take it into account. Which means you can make the numbers say whatever you want just by looking in the places you can expect to find support for the result you like.

> That's because just about any other discipline is capable of conducting a controlled experiment. The problem with statistics in social sciences is that you don't control anything.

Statistical controls are real controls, and are frequently used not only in social sciences, but in so-called "hard" sciences for large, complex, or distant systems that can't conveniently be set up in a laboratory. Laboratory-style control is one particularly convenient mechanism for isolating particular independent variables, but it's not a defining requirement of empirical science.
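
To make "statistical control" concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made up for illustration: the function names, the single binary confounder `z`, and the effect sizes. It only shows the mechanism (compare like with like inside each stratum of a measured confounder, then average), not any real study.

```python
import random
from statistics import mean

def simulate(n=20000, true_effect=1.0, seed=0):
    """Toy data: z confounds the relationship between exposure x and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                    # measured confounder
        x = rng.random() < (0.8 if z else 0.2)    # exposure is more likely when z is true
        y = true_effect * x + 3.0 * z + rng.gauss(0, 1)
        rows.append((z, x, y))
    return rows

def diff_in_means(rows):
    """Mean outcome of exposed rows minus mean outcome of unexposed rows."""
    exposed = [y for _, x, y in rows if x]
    unexposed = [y for _, x, y in rows if not x]
    return mean(exposed) - mean(unexposed)

def stratified_estimate(rows):
    """Compare within each level of z, then weight by stratum size."""
    estimate = 0.0
    for level in (False, True):
        stratum = [r for r in rows if r[0] == level]
        estimate += diff_in_means(stratum) * len(stratum) / len(rows)
    return estimate

rows = simulate()
naive = diff_in_means(rows)           # ignores z
adjusted = stratified_estimate(rows)  # controls for z
print(f"naive={naive:.2f} adjusted={adjusted:.2f}")
```

With these made-up parameters the naive comparison lands near 2.8 while the stratified one lands near the simulated effect of 1.0. That only works because `z` was measured and is the only confounder in the toy model.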

Using statistical controls is far more likely to lead to error because you controlled for three relevant variables when there were three thousand. This is drastically exacerbated by the political consequences of social science. Nobody can really gain any political advantage by publishing experimental results that show an erroneous gravitational constant and are immediately disproven by contrary experiments (cf. climate change, the papers denying which are taken seriously by no mainstream scientists), whereas papers purporting to show that racism is or is not still prevalent are the sort of thing that gets bills passed and politicians elected. The consequence is that publishing a paper in social science that supports a politically unpopular conclusion tends to be Very Bad for the careers of the scientists involved, with political opponents tearing apart anything they might have missed (because papers supporting popular opinion miss nothing?) and otherwise making every effort to discredit them.
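
A toy version of the "controlled for some variables but not all" failure, as a hedged sketch. The two confounders `z1` and `z2`, their effect sizes, and the helper names are all hypothetical and chosen so that the result is easy to see. The simulated effect of `x` on `y` is negative, and the question is what each choice of controls reports.

```python
import random
from collections import defaultdict
from statistics import mean

def simulate(n=40000, true_effect=-1.0, seed=1):
    """Toy data with two confounders; both raise exposure and the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1 = rng.random() < 0.5
        z2 = rng.random() < 0.5
        p_exposed = 0.1 + 0.2 * z1 + 0.6 * z2     # stays within (0, 1)
        x = rng.random() < p_exposed
        y = true_effect * x + 1.0 * z1 + 4.0 * z2 + rng.gauss(0, 1)
        rows.append({"z1": z1, "z2": z2, "x": x, "y": y})
    return rows

def estimate(rows, controls=()):
    """Size-weighted average of within-stratum differences in mean outcome."""
    strata = defaultdict(list)
    for r in rows:
        strata[tuple(r[c] for c in controls)].append(r)
    total = 0.0
    for group in strata.values():
        exposed = [r["y"] for r in group if r["x"]]
        unexposed = [r["y"] for r in group if not r["x"]]
        total += (mean(exposed) - mean(unexposed)) * len(group)
    return total / len(rows)

rows = simulate()
no_controls = estimate(rows)                   # controls for nothing
partial = estimate(rows, ("z1",))              # controls for one confounder
full = estimate(rows, ("z1", "z2"))            # controls for both
print(f"none={no_controls:.2f} z1 only={partial:.2f} z1+z2={full:.2f}")
```

In this setup, controlling for nothing gives roughly +1.6, controlling for `z1` alone still gives roughly +1.5, and only controlling for both recovers the simulated -1.0. The partially controlled number looks adjusted but still has the wrong sign, and nothing in the data itself flags that `z2` was left out.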