by 78666cdc 3692 days ago
It's clearly not a random sample of the population.

For most psychological research you don't need a very random sample, nor do you need a very large sample. It's got to do with the fact that our brains largely function the same. Your sample has to be large and random enough to be fairly confident that you didn't accidentally pick a significant number of people with some sort of mental divergence (e.g. very low IQ, very high IQ, autism, psychopathy, etc.) that could be relevant.

I don't think picking nearly 200 people from Amazon Mechanical Turk is going to risk that. But perhaps someone should research what kind of people are on Mechanical Turk.

Fun exercise: let's modify the conclusion to reflect the specific population bias we think the paper might have:

People working on Amazon Mechanical Turk are more likely to support conservative ideas when they are also susceptible to bullshit.

A much more important question is: what is the distribution of conservative/non-conservative supporters in the group, and how significant is the measured correlation?

edit: Just as a disclaimer, I made this comment based on a friend's psych master's thesis presentation that I attended when I was at university, where I posed the same question (N was only 16, and only local students were surveyed). He explained to me that for his particular research (on the correlation between auditory senses and motor skills) a small group was sufficient because of the fundamental brain structure we (mostly) all share. Whether that holds up for more complex research like this I don't know; I'm not a psych student.

The view you're espousing is wrong. When you actually test whether what you're calling human universals are in fact universal, you generally find they're not:

"Broad claims about human psychology and behavior based on narrow samples from Western societies are regularly published in leading journals. This review suggests not only that substantial variability in experimental results emerges across populations in basic domains, but that standard subjects are in fact rather unusual compared with the rest of the species - frequent outliers. The domains reviewed include visual perception, fairness, categorization, spatial cognition, memory, moral reasoning and self-concepts. This review (1) indicates caution in addressing questions of human nature based on this thin slice of humanity, and (2) suggests that understanding human psychology will require tapping broader subject pools. We close by proposing ways to address these challenges."

(The thin slice of humanity referred to above is westernized college students. Perhaps Mechanical Turkers do not suffer from this bias? Seems unlikely, though.)

http://www2.psych.ubc.ca/~ara/Manuscripts/Weird_People_BBS_H...

also see http://www.scientificamerican.com/podcast/episode/psychology... and http://www.slate.com/articles/health_and_science/science/201... for pop write-ups.

Thanks! That's interesting; it seems I might have had it wrong. I should've brought that up during my friend's master's thesis presentation, maybe he wouldn't have got his diploma ;) Are psychology researchers in general aware of this review? Obviously my friend's master's thesis was just a small, inconsequential study, but if the universals that are now commonly used have turned out to not actually be true, wouldn't that mean huge swathes of researchers have to retract/redo their research?

> but if the universals that are now commonly used have turned out to not actually be true, wouldn't that mean huge swathes of researchers have to retract/redo their research?

Reproducibility is a major concern. Many experiments are poorly designed.

http://science.sciencemag.org/content/349/6251/aac4716

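So, yes: in that replication project only around 40% of the studies replicated, which suggests roughly 60% of this kind of research needs to be redone.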

Even the paper itself acknowledges the weaknesses of the sample. It's much easier to make "based on consistent responses in our sample, humans generally behave in a certain way" generalisations from a small, unrepresentative sample than to conclude "this cohort of humans we've unrepresentatively sampled generally behaves differently from that cohort of humans we've unrepresentatively sampled".

Consider the hypothesis that (for the sake of simplicity) half of conservatives support conservative candidates because they're "susceptible to bullshit" and half of conservatives support them because they're hardworking professionals who believe those candidates' brand of fiscal conservatism is most aligned with their rational self interest in not being taxed too highly.

Which of these two categories of Trump and Cruz supporters would be underrepresented on AMT, a place for underemployed people to find menial work?
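
To put a rough number on that selection effect, here is a minimal sketch. It assumes the 50/50 split from the hypothesis above and a made-up sampling rate (the "professional" category turning up on AMT at a quarter of the rate of the other); the function name and both rates are purely illustrative, not taken from the paper.

```python
def observed_share(true_share_a, sampling_rate_a, sampling_rate_b):
    """Share of category A in a sample when A and B are recruited at different rates."""
    sampled_a = true_share_a * sampling_rate_a
    sampled_b = (1.0 - true_share_a) * sampling_rate_b
    return sampled_a / (sampled_a + sampled_b)


# Hypothetical numbers: half of conservatives fall in each category, but the
# "hardworking professional" category (B) is recruited at 1/4 the rate of A.
share = observed_share(true_share_a=0.5, sampling_rate_a=1.0, sampling_rate_b=0.25)
print(f"Share of 'susceptible' conservatives in the sample: {share:.0%}")  # 80%
```

So a population that is really 50/50 would look 80/20 in the sample under that assumed recruitment gap, without anything being different about conservatives as a whole.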

>For most psychological research you don't need a very random sample, nor do you need a very large sample.

Heck, as long as you get published and get a grant, any size or representativeness of sample will do (even straight made up statistics, if there's no big fear of being found out).

  you don't need a very random sample, nor do you
  need a very large sample. It's got to do with the
  fact that our brains largely function the same.

Do you think, if I did a study of political leanings among college undergraduates, that I'd get the same results as a study of the general population because "our brains work the same"?

This is not a study of political leanings of people. It's a study of the correlation between a characteristic of a person and their political leaning.

If you take 200 college undergraduates, let's say 80% of them are lefties. That means you will have 160 lefties and 40 righties. If 50 of the 160 lefties (31%) and 16 of the 40 righties (40%) believe in BS, you can take those numbers to your statistician and ask whether that difference is significant.

Just the fact that the correlation shows up among college students, where you might not expect it, is a hint that it could hold up for the population in general. At the least, it could warrant a larger investigation.
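
For what it's worth, the statistician's check on the hypothetical 200-undergraduate split above can be sketched in a few lines. This uses a plain Pearson chi-square test on the 2x2 table without a continuity correction, which is one reasonable choice among several; the helper name `chi_square_2x2` is made up for the example, and the counts are the invented ones from this comment, not data from the paper.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: lefties, righties. Columns: believes in BS, does not.
chi2, p = chi_square_2x2(50, 110, 16, 24)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")
```

With these particular made-up counts the statistic comes out around 1.1 with p near 0.3, so a 31% vs. 40% gap in a sample this size and this lopsided would not count as significant on its own.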

There are some rather more fundamental issues with the paper.

Let's do a little bit of a dive...

1. The "BSR" is 10 questions on a Likert scale with extraordinarily vague labels. So, what's the difference between "somewhat profound" and "fairly profound"? How confident are you that different populations (eg lib v con) will have similar views on the difference between "somewhat" versus "fairly"?

2. Liberalism/conservatism, meanwhile, is a single question on a Likert scale. 1 to 7, how conservative are you? So, self-image, not actual conservatism. And given 109 participants rated themselves on the liberal side versus 46 on the conservative side, it's going to be dominated by "just how extreme do you think your liberalism is?"

3. But best fun of all -

On the left, we have
  1 = liberal
  1 = not at all profound
  1 = not at all favourable

On the right side of their questions we have
  7 = conservative
  5 = very profound
  5 = very favourable

109 participants were Liberal (less than 4 on lib/con Likert item)
46 participants were Conservative (above 4 on lib/con Likert item)

So, just the factor of "how much do you like to tick the extreme boxes on a Likert scale" would give a correlation like the one they get.

More likely to pick a 1 than a 2 on a Likert item? You'll rank as both more liberal and less receptive to bullshit then... Like to leave a radio box on the left so you don't feel extreme? That'll register you slightly more conservative and slightly more receptive then...

And as the participant pool is 2:1 liberal:conservative, that extremeness factor will produce candidate correlations like the ones they get too. (More extreme tickers are likely to be going for 1s on lib, 1 not profound, 1 not favourable of Republicans, and 5s on favourability of Democrats. Middle-of-the-road tickers are likely to be going 2s for lib, 2s for profound, 2s for Reps, and 4s for Dems they like. That gives the middle-of-the-road tickers a higher score for bullshit receptivity, and makes them look less liberal, less favourable of Dems, and more favourable (less unfavourable) of Republicans.)

MeanMundane is bang on the middle (3.1 mean), neatly unaffected by "extremeness" factor, whereas MeanBullshit isn't (2.6), so extremeness will push out the "controlled" correlations neatly too.

(Yes, I'm procrastinating, and had a brief back-of-the-envelope poke around their CSV of data...)
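
The "extremeness" argument above can be checked with a toy simulation. This is only a sketch under invented assumptions: every simulated respondent has the same low opinion of the statements, ideology is fixed by group, and the only thing that varies is a response-style trait (how far toward the end of the scale they tick). The slopes, noise levels and function names are hypothetical; nothing here uses the paper's CSV.

```python
import random


def clamp_round(x, lo, hi):
    """Round to the nearest box and keep it on the scale."""
    return max(lo, min(hi, round(x)))


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_lib, n_con, seed=0):
    """Correlation between the 1-7 lib/con item and a 1-5 profundity item
    when response style is the ONLY thing driving the profundity rating."""
    rng = random.Random(seed)
    ideology, profundity = [], []
    for i in range(n_lib + n_con):
        direction = -1 if i < n_lib else 1  # liberals tick left of 4, conservatives right
        extremeness = rng.random()          # 0 = hugs the middle, 1 = ticks the end boxes
        ideology.append(
            clamp_round(4 + direction * (1 + 2 * extremeness) + rng.gauss(0, 0.5), 1, 7)
        )
        # Everyone thinks the statements are not profound; extreme tickers say 1, others 2-3.
        profundity.append(
            clamp_round(2.5 - 1.5 * extremeness + rng.gauss(0, 0.5), 1, 5)
        )
    return pearson(ideology, profundity)


# Same 109:46 ratio as the paper's pool, scaled up to reduce sampling noise.
r_skewed = simulate(109 * 200, 46 * 200)
r_balanced = simulate(15500, 15500)
print(f"2:1 liberal pool: r = {r_skewed:.3f}")
print(f"balanced pool:    r = {r_balanced:.3f}")
```

Under these assumptions the liberal-heavy pool should show a small positive "conservatism vs. profundity" correlation even though nobody's receptivity actually depends on their politics, while the balanced pool should sit near zero because the effect cancels between the two sides.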

I see this as far more sociological than psychological, where the standards for sufficiently large and random samples are much higher. It wraps up too many particularities of today's politics to be immune to sampling problems beyond IQ and mental illness.

Most people using AMT are people with lower/limited incomes; see r/mturk.

... and my guess is if they had started with the standard approach (recruit university students) their results wouldn't replicate.