HN Mirror



	by bunderbunder 739 days ago
Also, this idea that bigger is better with sample sizes can lead to problems on the other side, when we see people assuming an effect must be real because the sample size is so large. The problem is, sample size only helps you reduce sampling error, which is one of many possible sources of error. Most of the others are much more difficult to manage or even quantify. At some point it becomes false precision, because it turns out that the error you can't measure is vastly greater than the sampling error.

Which in turn gets us into trouble with interpreting p-values. It gets us into a situation where the distinction between "probability of getting a result at least this extreme, assuming the null hypothesis" and "probability the alternative hypothesis is false" stops being pedantic hair-splitting and starts being a gaping chasm. I don't like getting into that situation, because, regardless of what we were all taught in undergrad, scientific practice still tends to lean toward the latter interpretation. (Except experimental physicists. You people are my heroes.)

For my part, the statistician in me rather likes methodologically clean controlled experiments with small sample sizes. You've got to be careful about how you define "methodologically clean", of course. Statistical power matters. But they've probably led us down a lot fewer blind alleys (and, in the case of medical research, led to fewer unnecessary deaths) than all the slapdash cohort studies that were so popular in the '80s and '90s, and that we trusted because of their large sample sizes.
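
To put rough numbers on the p-value point above, here is a minimal sketch using Bayes' rule. It treats a "significant" result as a yes/no event with false-positive rate `alpha`, which is a simplification of the p-value definition quoted in the comment. The prior probabilities, the power of 0.8, and the function name are illustrative assumptions, not figures from the thread.

```python
def prob_null_given_significant(prior_true: float, power: float, alpha: float) -> float:
    """P(null is true | test came out significant), by Bayes' rule."""
    false_pos = alpha * (1.0 - prior_true)  # null true, test significant anyway
    true_pos = power * prior_true           # effect real, test detects it
    return false_pos / (false_pos + true_pos)


# Hypothetical priors: how plausible the effect was before the study.
for prior in (0.5, 0.1, 0.01):
    risk = prob_null_given_significant(prior, power=0.8, alpha=0.05)
    print(f"prior P(effect real)={prior:>4}: P(null | significant)={risk:.2f}")
```

Under these assumptions the same 0.05 threshold leaves the null true in roughly 6%, 36%, or 86% of significant results, depending only on the prior. Unmeasured systematic error would widen the gap further, since it raises the real false-positive rate above the nominal `alpha`.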

2 comments

devbent 739 days ago

Diet studies can also fall into a similar trap.

You get either a huge sample size where all food intake is self-reported, or a tiny sample size where test subjects are locked in a chamber that measures all energy output from their bodies while they are fed a carefully controlled diet.

The latter is super expensive, but you can be pretty confident of the results. On the flip side, it also misses any conditions that only present in a small % of the population.

You can see this with larger dietary studies where, out of 2 cohorts of 100 each doing different diets, 15 or 20% of each group do really well on some "extreme" diet (e.g. keto), but the group on average has no unexpected results.

If your sample size is 5, it is quite possible none of your test subjects are going to be strong responders to, for example, keto.
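
As a rough check on that claim, the chance of drawing no strong responders is a binomial calculation. This sketch uses the 15% and 20% responder rates mentioned above; treating subjects as independent random draws from the population is an assumption, and the function name is made up for illustration.

```python
from math import comb


def prob_at_most_k_responders(k: int, n: int, responder_rate: float) -> float:
    """Binomial CDF: chance that a sample of n contains k or fewer responders."""
    return sum(
        comb(n, i) * responder_rate**i * (1.0 - responder_rate) ** (n - i)
        for i in range(k + 1)
    )


# Responder rates of 15% and 20% come from the comment; independence between
# subjects is an assumption of this sketch.
for rate in (0.15, 0.20):
    for n in (5, 100):
        p_none = prob_at_most_k_responders(0, n, rate)
        print(f"rate={rate:.2f} n={n:>3}: P(no responders)={p_none:.4f}")
```

With n = 5 and a 20% responder rate, about a third of such trials would contain no responder at all, and about three quarters would contain at most one.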

So then the study headline comes out: "Keto doesn't work! Well controlled expensive trial!"

Meanwhile the large cohort study releases results saying "on average Keto doesn't work".

But in reality, it works really well for some % of the population!

Some non-stimulant ADHD drugs have a similar problem. If a drug only works for 20% of the population, you need to be aware of that when doing the study design.
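
One way to see why the responder fraction matters for study design is a back-of-the-envelope sample size calculation. It uses the standard normal-approximation formula for a two-sided, two-sample comparison of means. The 1 SD effect in responders is a hypothetical number, and the sketch assumes unit variance in both arms, which slightly understates the required n because a mix of responders and non-responders inflates the variance in the treated arm.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(std_effect: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Per-arm sample size for a two-sided, two-sample z-test (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return ceil(2.0 * (z_alpha + z_power) ** 2 / std_effect**2)


responder_effect = 1.0  # hypothetical: 1 SD improvement in responders
responder_rate = 0.20   # the 20% figure from the comment

# Averaged over everyone, the effect is diluted by the non-responders.
diluted_effect = responder_rate * responder_effect
print("n per arm, responders only:", n_per_arm(responder_effect))
print("n per arm, whole population:", n_per_arm(diluted_effect))
```

Under these assumptions, a trial that could enroll only responders needs 16 subjects per arm, while one drawing from the whole population needs 393 per arm to detect the same underlying effect.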


bunderbunder 739 days ago

You seem to be implying that subgroup analysis never happens?

I guess I don't follow weight loss research closely, but I would be genuinely amazed if they didn't do it, too, given how ubiquitous it is everywhere else in medical science. And the literature on ketogenic diets goes back over a century now, so it's hard to imagine nobody has done one. Could it be instead that people did do the subgroup analysis, but didn't find a success predictor that was useful for the purposes of establishing medical standards of care or public health policy? Or some other wrinkle? Or maybe people are still actively working on it but have yet to figure out anything quite so conclusive as we might wish? But that this nuance didn't make it into any of the science reporting or popular weight loss literature, because of course it didn't, details like that never do?

Disclaimer, I'm absolutely not here to trash keto diets in general. I have loved ones who've had great success with such a diet. My concern is more about the tendency for health science discussions to devolve into a partisan flag-waving contest where the first useless thing to get chucked out the window is a sober and nuanced reading of the entirety of the available body of evidence.


devbent 739 days ago

> Could it be instead that people did do the subgroup analysis, but didn't find a success predictor that was useful for the purposes of establishing medical standards of care or public health policy?

If we are all being generous with assumptions, this could very well be the reason.

I haven't seen much research on predicting which dietary interventions will be most effective on an individualized treatment basis, but I also haven't kept up with the literature for five or six years.

Then again, the same promise exists for ADHD medication, where there are now some early genetic studies showing how we could perhaps guide treatment, but the current standard of care remains: throw different pills at the patient and see what works best with the fewest side effects.

Of course dietary stuff is complicated due to epigenetics, environmental factors, and gut microbiomes.

That said, progress is being made and the knowledge we have now is worlds apart from the knowledge we had 20 years ago, but sadly it seems outcomes for weight loss are not improving.


MajimasEyepatch 739 days ago

That's a great point. If your experimental methodology is flawed, it doesn't matter how big your sample size is. A study like this lets you gather some compelling evidence that you may have a real effect. Then you can refine the technique. Autism is a very active area of research, so I suspect we'll see other groups attempt to replicate this study and adapt its techniques while the original authors refine the technique and get funding to perform larger studies.
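
The point about flawed methodology and sample size can be sketched with a simple error budget. The standard error shrinks as 1/sqrt(n), but a fixed systematic bias does not shrink at all. The numbers below (a standard deviation of 10, a bias of 0.5) are hypothetical, as is the function name.

```python
from math import sqrt


def error_budget(n: int, sd: float, bias: float) -> tuple[float, float, float]:
    """Return (standard error, RMSE of the estimate, bias measured in standard errors)."""
    se = sd / sqrt(n)             # sampling error: shrinks with n
    rmse = sqrt(bias**2 + se**2)  # total error: floors at the bias
    return se, rmse, bias / se


# Hypothetical measurement: sd = 10 units, systematic bias = 0.5 units.
for n in (100, 10_000, 1_000_000):
    se, rmse, z = error_budget(n, sd=10.0, bias=0.5)
    print(f"n={n:>9,}: SE={se:.3f}  RMSE={rmse:.3f}  bias/SE={z:.1f}")
```

In this sketch, going from 100 to 1,000,000 subjects cuts the standard error from 1.0 to 0.01, yet the total error never drops below 0.5. The bias alone ends up 50 standard errors from zero, so a pure artifact would look overwhelmingly significant.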
