Hacker News
by seanmcdirmid 4402 days ago
Numbers allow us to identify problems. Without numbers, there are no problems, so there are no issues to address; I see how that would solve the problem!

> Numbers allow us to identify problems.

The problem is that in these sorts of cases the numbers are generally useless. Even if we eliminated 100% of all racial discrimination from society, fewer African Americans would attend college because fewer of their parents can afford to send them, and those types of consequences would carry down for generations completely regardless of continuing racism.

So you say you want to stick with the numbers anyway and try to account for income level. OK boss, that will widen your confidence interval by a good bit but we can do it. The trouble is poverty is not the only issue. The fertility rates are different. African Americans on average have more children than whites and Asians according to the most recent census (2.1 vs. 1.8), so for the same parental income level the money is split between more children, as is parental time and attention. African Americans are also significantly more likely to grow up in single parent families. That one's 65% for African Americans vs. 23% for whites and 16% for Asians. Ouch. So we have to account for that stuff too. And those all interact. If you have three children being raised by one parent making $30,000/year as compared with two children being raised by two parents each making $50,000/year, expecting to get anything resembling the same results is bonkers.
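
To put rough numbers on that last comparison, here is a minimal sketch of the per-child arithmetic. The function name is made up for illustration, and splitting household income evenly across children is a simplifying assumption, not a claim about how families actually spend.

```python
def income_per_child(parent_incomes, children):
    """Household income divided evenly across children (simplifying assumption)."""
    return sum(parent_incomes) / children

# One parent at $30,000/year raising three children.
one_parent = income_per_child([30_000], 3)

# Two parents at $50,000/year each raising two children.
two_parents = income_per_child([50_000, 50_000], 2)

print(one_parent)                # 10000.0
print(two_parents)               # 50000.0
print(two_parents / one_parent)  # 5.0
```

On these figures the second household has five times the income per child, before any difference in parental time is counted.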

> Even if we eliminated 100% of all racial discrimination from society, fewer African Americans would attend college because fewer of their parents can afford to send them, and those types of consequences would carry down for generations completely regardless of continuing racism.

Precisely. The issue with numbers is that people will focus on them and draw the conclusion that "as long as it's not 50/50, it means there is some RACISM at work somewhere" without understanding the underlying causes.

It's ALWAYS the same issue with numbers and statistics: used in the wrong context, you can manipulate them to say what you want to say, instead of using numbers to explain the truth.

That's pseudoscience at best.

Experiments through data are by no means impossible when it comes to race or gender. That's the lifeblood of social science. Throwing your hands up and saying "too many numbers! no conclusions could ever be possibly found!" would be completely unacceptable in any other discipline. You're now picking and choosing which fields can even use basic statistics.

> Experiments through data are by no means impossible when it comes to race or gender. That's the lifeblood of social science.

It's also why hard science majors make fun of them.

> Throwing your hands up and saying "too many numbers! no conclusions could ever be possibly found!" would be completely unacceptable in any other discipline.

That's because just about any other discipline is capable of conducting a controlled experiment. The problem with statistics in social sciences is that you don't control anything. You can't just order families of a particular race to stop having more or fewer children than other races so that you can get a good control group, so you have no control group. You only have data from something you hope is a reasonable approximation of a control group, without even any good way to tell when it isn't.

> It's also why hard science majors make fun of them.

Hard science recognizes social science work when solid data is used and the methodology is well understood and effective. You're generalizing.

> That's because just about any other discipline is capable of conducting a controlled experiment

> You can't just order families of a particular race to stop having more or fewer children than other races so that you can get a good control group, so you have no control group.

You look at families of a race that had fewer children and compare them to families of the same race with more children. That would be a data experiment controlled for race. Read Freakonomics if you want to understand data experiments better.

> You look at families of a race that had fewer children and compare them to families of the same race with more children. That would be a data experiment controlled for race.

That's exactly how you compound the problem and get the wrong answer. How do you know that the factors causing parents to have more or fewer children are the same between races, or that those factors don't directly impact parenting ability? Suppose the predominant factor in low income Asian Americans having three or more children is a calculated decision that the couple's extended family has enough resources to responsibly raise that number of children (i.e. rich uncle), but the predominant factor in low income African Americans having three or more children is accidental pregnancy.

At first you had to take into account income level, but to do that you have to factor out fertility rate, and to factor that out you have to account for the different causes behind the differing fertility rate. If we then discover that the predominant cause of accidental pregnancy in African Americans is religious opposition to birth control or abortion, don't we have to then account for the causes and consequences of a higher degree of faith in religion?

Nobody has the resources to go all the way down the rabbit hole. But everywhere you look there is some factor that would change the outcome by 50% in one direction or the other if you take it into account. Which means you can make the numbers say whatever you want just by looking in the places you can expect to find support for the result you like.

> That's because just about any other discipline is capable of conducting a controlled experiment. The problem with statistics in social sciences is that you don't control anything.

Statistical controls are real controls, and are frequently used not only in social sciences, but in so-called "hard" sciences for large, complex, or distant systems that can't conveniently be set up in a laboratory. Laboratory-style control is one particularly convenient mechanism for isolating particular independent variables, but it's not a defining requirement of empirical science.
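
For readers unfamiliar with the term, here is a minimal sketch of what a statistical control does, using purely synthetic data. Everything in it is an assumption made for illustration: the variable names, the coefficients, and the setup in which a hidden factor `z` drives both `x` and `y` while `x` has no effect on `y` at all. The adjustment step regresses `z` out of both variables before comparing them, which is one standard way to control for a measured variable.

```python
import random

def slope(xs, ys):
    """Ordinary least squares slope of ys regressed on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def residuals(xs, ys):
    """What is left of ys after removing its linear dependence on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    b = slope(xs, ys)
    return [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# Synthetic setup (assumed): z drives both x and y; x does nothing to y.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [2.0 * zi + rng.gauss(0, 1) for zi in z]

# Ignoring z, x looks like it has a sizeable effect on y (slope near 1).
naive = slope(x, y)

# Controlling for z removes the spurious effect (slope near 0).
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

The sketch cuts both ways: the control works when the confounder is measured and included, and the naive estimate is badly off when it is left out.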

Using statistical controls is far more likely to lead to error because you controlled for three relevant variables when there were three thousand. This is drastically exacerbated by the political consequences of social science. Nobody can really gain any political advantage in publishing experimental results that show an erroneous gravity constant and are immediately disproven by contrary experiments (cf. climate change, the papers denying which are taken seriously by no mainstream scientists), whereas papers purporting to show that racism is or is not still prevalent are the sort of things that get bills passed and politicians elected. The consequence is that publishing a paper in social science that provides support for a politically unpopular conclusion tends to be Very Bad for the careers of the scientists, with political opponents tearing apart anything they might have missed (because papers supporting popular opinion miss nothing?) and otherwise making every effort to discredit them.

> Without numbers, there are no problems

Who said that? Of course you can identify problems without having any numbers. That's called qualitative understanding.

How does unbiased qualitative reasoning work if counting isn't allowed? Genuinely curious, citations would be helpful.

In science, we call out qualitative reasoning as being biased and unscientific.

> In science, we call out qualitative reasoning as being biased and unscientific.

Ha! I'm a scientist by training, and your claim makes me smile. Most of science starts with qualitative reasoning and observation. It's because you notice phenomena that you form hypotheses as to why they occur, and then you design experiments to generate data and verify your hypothesis (i.e. whether your qualitative understanding is correct or not).

Right, we use qualitative reasoning at the beginning and try to temper our biases separately, but how can you do unbiased evaluation without numbers? Even the social sciences have to rely on numbers and statistics eventually.

> but how can you do unbiased evaluation without numbers?

First, data must be collected to answer a question. The current way of asking ethnicity based on unvalidated criteria (basically what you identify yourself as) does not mean anything. It's rubbish as data, because there are almost no "pure" individuals in the US anymore, people have been mixed for generations.

The current data is used to advance a political agenda: to say that we are in a state of inequality between races and sexes and that the government has to step in to fix things, hence you need the government to spend money and resources on this, etc... It's NOT a scientific study at work, it's data used for political purposes.

Plus, it's not unbiased either because it's not in an observational state. Individuals and companies are aware of these ratios in these companies and know that they are expected to do something about it. That's not science at work, it's social pressure at work.

> It's rubbish as data, because there are almost no "pure" individuals in the US anymore, people have been mixed for generations.

It's not rubbish. You can't simultaneously discuss statistics about black incarceration or female underrepresentation in tech while also denying that such classifications even exist in the first place. The lines blur sometimes, but pretending there are no lines denies reality.

> that the government has to step in to fix things

You're putting the cart before the horse. This is a private company's data, not any specific recommendation for government action.

> Plus, it's not unbiased either because it's not in an observational state. Individuals and companies are aware of these ratios in these companies and know that they are expected to do something about it. That's not science at work, it's social pressure at work.

First, some companies just plain don't care and don't feel any social pressure because they're insulated from any real feedback or criticism. Second, any social science work includes some degree of bias because we're not all robots. Saying no possible conclusions can be drawn from demographic data is unscientific and akin to global warming denial.