Hacker News
by Dewie 4230 days ago
> Well, the sample is quite small,

Regards,

Peanut gallery on every article on some research


Yes, it's time we stop reflexively dismissing every study with a small sample size as if that made it worthless. This is the latest meme in a sequence that includes "correlation is not causation" and "the plural of anecdote is not data". It's a generic dismissal and has a de-interesting effect on discussion. It has more in common with internet reflexes like "First" and "Betteridge" than with reflective thought.

One way to catch oneself before doing this is to ask if you're tacitly assuming that the people doing the work are idiots. The odds—and the Principle of Charity [1]—suggest they're not. Comments that imply this generically are usually low-quality.

If the people doing the work really are dumb, then it almost certainly has more specific flaws (e.g. "this way of measuring glucose isn't reliable", to make something totally up) that it would be far more helpful to point out.

1. http://en.wikipedia.org/wiki/Principle_of_charity

I have to strenuously disagree with you (respectfully, of course) on this issue, dang. My several years of participation in the University of Minnesota journal club on behavior genetics, in which we go over current papers, sometimes being amazed by the papers (especially the papers on the methodology of doing science right) and sometimes picking apart the papers line by line (especially the papers on social psychology), have taught me to the core that most scientists are doing a job narrowly conceived of as trying to get published. I think they mostly went into science because they also desire to seek truth and inform humanity, but institutional pressures sometimes keep them from achieving that ideal. A surprisingly huge percentage of working scientists, as John P.A. Ioannidis,[1] Uri Simonsohn,[2] Peter Norvig,[3] and special issues of science journals[4] and methodological blogs[5] remind us, are actually working right at their own level of competence or even beyond their own level of competence in setting up experiments and observational studies to be validly generalizable. I do NOT assume "that the people doing the work are idiots," but I do assume, based on huge numbers of historical and current events examples, that the people doing the work are fallible and subject to cognitive biases (as I am) and subject to pressure to publish to stay funded. Moreover, announcements of results are subject to wishful thinking on the part of a university press office.

Until a preliminary study has been replicated, I don't take it to be a statement of facts about the world. Even Wikipedia, which accepts some very dodgy user-submitted content, declares a content guideline that Wikipedia articles should be based on SECONDARY sources[6] (that is, sources by authors who have thought about and digested the primary research findings) rather than on preliminary primary research findings. (Of course, for lack of enforcement, many Wikipedia articles break this rule.) It's especially important to establish high standards of sourcing for statements about human health and medicine and nutrition.[7]

I genuinely think that many (too many) readers of Hacker News have no idea what an adequate sample size would be, for a given effect size, to validly infer from a preliminary study result a statement about the entire population. We should be talking about sample sizes all the time here (I agree, with more sophistication and nuance than we often do) as part of educating ourselves about basic science methodology in this community of intellectual discussion.
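
To make "adequate sample size for a given effect size" concrete, here is a minimal sketch using the textbook normal-approximation formula for comparing the means of two equal-sized groups. Everything specific in it is an assumption for illustration: the function name `n_per_group`, the use of a standardized effect size (Cohen's d), a two-sided alpha of 0.05, and a target power of 0.80. An exact t-test calculation would give slightly larger numbers, and none of this describes any particular study.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects needed per group (hypothetical helper).

    Two-sided, two-sample comparison of means, normal approximation:
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Conventional "small", "medium", "large" values of d.
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {n_per_group(d)} subjects per group")
```

The required n grows with 1/d², so halving the effect size you hope to detect roughly quadruples the sample you need.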

That said, I heartily agree that "Betteridge's Law" is a useless Internet meme, even though it was popularized here by our esteemed site founder pg. We can do better, and we can raise the level of discussion here. I cherish the participants here who can speak knowledgeably about experiment design, about effect sizes, about observational studies as contrasted with experimental studies, and so on. I also delight when participants here share links to the prior scholarly literature, and especially when something is submitted here that is a better source than a press release.[8] Besides decrying crap, I like to applaud thoughtful discussion, so I regularly upvote comments that point us beyond the headlines to what issues researchers have to grapple with as they try to figure out the complexity of the world.

[1] https://med.stanford.edu/profiles/john-ioannidis?tab=publica...

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1182327/

[2] http://opim.wharton.upenn.edu/~uws/

[3] http://norvig.com/experiment-design.html

[4] http://pps.sagepub.com/content/7/6/528.full

[5] http://retractionwatch.com/

[6] https://en.wikipedia.org/wiki/Wikipedia:Identifying_reliable...

[7] https://en.wikipedia.org/wiki/Wikipedia:Identifying_reliable...

[8] http://www.phdcomics.com/comics/archive.php?comicid=1174

I feel like the point may be getting lost here. If you're poring over a paper line-by-line, you're engaging with that specific material and are in a position to point out specific things about it. That's not a problem. The problem is generic dismissals—the ones that don't take work or thought. Those dilute discussion, especially when they're an internet reflex as this one is becoming.

Most published studies may well be false, but that's no basis for substantively discussing a specific one, any more than "most movies are bad" is a movie review.

The problem, dang, is that some of the submissions here are generic submissions (health is a chronic topic here, because we all desire to be healthy), and not submissions based on a line-by-line reading of a few different sources to see which one is worthy of discussion here. A university press release about a small-n preliminary research finding rarely deserves more dismissal than noting it is a publicity puff piece about an unsettled conclusion.

We only discuss what's submitted here. The first filter that distorts reality here is what never gets submitted, because it is thoughtful and nuanced and takes too much thinking to read and discuss.

> I have to strenuously disagree with you (respectfully, of course)

Try not to trip on those eggshells.

It's not just a meme. There have been multiple articles and papers over the last few years on how a great number of clinical trials are statistically underpowered and therefore suspect; this includes a 2005 paper entitled "Why Most Published Research Findings Are False", which analyzes "49 of the most highly regarded research findings in medicine over the previous 13 years" and finds that a worrying number of them haven't withstood replication.

And when people talk about underpowered studies, the figure mentioned is fewer than 100 individuals, not fewer than 20.
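
For a feel of what "underpowered" means at those sizes, here is a rough sketch of the power of a two-sided, two-group comparison under a normal approximation. The function name `approx_power`, the even split of the total sample into two groups, the alpha of 0.05, and the example effect size d = 0.5 are all assumptions for illustration, not figures from any paper mentioned here.

```python
from statistics import NormalDist


def approx_power(effect_size, total_n, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (hypothetical helper).

    Assumes total_n subjects split evenly into two groups and a
    standardized effect size (Cohen's d).
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    n_per_group = total_n / 2
    # Expected value of the test statistic when the true effect is d.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_alpha) + z.cdf(-shift - z_alpha)


if __name__ == "__main__":
    for total in (20, 100, 400):
        print(f"total n = {total}: power ~ {approx_power(0.5, total):.2f}")
```

Under these assumptions a medium-sized effect is missed most of the time at 20 subjects in total and is still missed fairly often at 100.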

Oh I know. You're preaching to the choir, believe me! But that's not the meme. The meme is invoking it to dismiss a study with no further thought and no specific content.

Small samples get overinterpreted, many published findings are false, etc.—it's all true and the community here is well aware. But it's a gross overreaction to dismiss small-sample experiments wholesale. Even a few seconds' reflection is enough to see that.

For example, assuming this experiment was rigorous, it didn't need more than 16 subjects to find that greater saturated fat in diet doesn't automatically cause greater fat in blood. Additional resources might be better spent on future samples (i.e. replicating the finding by other researchers) than on a larger single sample, which is probably a game of rapidly diminishing returns. And so on.
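
The "diminishing returns" part can be sketched with the standard result that the standard error of a sample mean shrinks like 1/sqrt(n). The helper name `relative_standard_error` and the choice of 16 as the baseline are assumptions for illustration; this says nothing about the separate value of independent replication.

```python
import math


def relative_standard_error(n, baseline_n=16):
    """Standard error at sample size n, relative to baseline_n (hypothetical helper).

    Uses SE proportional to 1 / sqrt(n), which holds for the mean of
    independent observations with a fixed variance.
    """
    if n <= 0 or baseline_n <= 0:
        raise ValueError("sample sizes must be positive")
    return math.sqrt(baseline_n / n)


if __name__ == "__main__":
    for n in (16, 64, 256, 1024):
        print(f"n = {n}: standard error is {relative_standard_error(n):.3f} of the n = 16 value")
```

Each halving of the standard error costs four times as many subjects, which is why growing a single sample gets expensive quickly.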

The point is that HN wants reflective discussions, not reflexive ones. It takes no work and no real thought to pick out one detail that people are currently primed to fuss over and make a post of it. That's not reflection, it's habit, and its payload is not learning, but reinforcement. Reflection requires engaging with the material—this specific material.

There are two ways to do that. One is to dig in and learn the material, think about it, and report your findings to HN. The other is to happen to know something about it in the first place. The first takes work, the second luck. Comments based on neither work nor knowledge are likely not to be substantive. That's why we want to avoid generic dismissals as opposed to specific ones.

They only analysed 49 research findings? Pah! Far too few to draw any real conclusions :)

Keep in mind that the clearer an effect is, the smaller the sample size has to be.