Hacker News
by not2b 2105 days ago
This guy overstates his case somewhat. Consider:

"If the original study says an intervention raises math scores by .5 standard deviations and the replication finds that the effect is .2 standard deviations (though still significant), that is considered a success that vindicates the original study!"

Why the exclamation point here? The replication study isn't magically more accurate than the original study. If the original paper finds a 0.5 standard deviation effect and the replication study finds a 0.2 standard deviation effect, that increases our confidence that a real effect was measured, but there's no reason to believe that the replication study is more accurate than the original study. Maybe the true effect is less than measured, but maybe not. So yes, it should be considered a success.

7 comments

When I advise decision makers on reading statistics (in my case, state-wide health data), I urge them to focus on effect size and only use significance as a filter. Two reasons:

1. Effect size is the most important thing. The point of the study is (usually) to guide decisions. Sticking with the article's example, let's say combining both studies shows the increase is likely 0.35 standard deviations. Is the intervention still worth the cost? Is it still the best option?

2. If there's enough data (e.g., an observational study) or a good chance of omitted variables, there's going to be a "statistically significant" difference. No matter what's measured. I would bet my life's savings there's a statistically significant difference in profits of New York businesses depending on whether the owner's named Jim or Bob. A replication of the experiment with all Jim and Bob businesses in another state would also guarantee significance. So it's a coin toss whether the second study would "successfully replicate" the same direction of effect.
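For what it's worth, here is a rough Python sketch of both points. The standard errors, the 0.01 effect, and the sample sizes are made-up illustration values, not figures from any study, and the function names are my own. It assumes a plain fixed-effect (inverse-variance) pooling and a normal approximation for the standard error of a standardized difference.

```python
import math
from statistics import NormalDist

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

def two_sided_p(d, n_per_group):
    """Two-sided p-value for a standardized difference d between two groups
    of n_per_group each, using the normal approximation SE ~ sqrt(2 / n)."""
    z = d / math.sqrt(2.0 / n_per_group)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Point 1: with (hypothetical) equal standard errors, 0.5 and 0.2 pool to 0.35.
pooled, pooled_se = pool_fixed_effect([0.5, 0.2], [0.15, 0.15])
print(f"pooled effect = {pooled:.2f} SD (SE {pooled_se:.3f})")

# Point 2: a negligible 0.01 SD difference turns "significant" once n is huge.
for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {two_sided_p(0.01, n):.3g}")
```

If the replication had a smaller standard error than the original, the pooled estimate would sit closer to 0.2 than to 0.35, which is why the decision question in point 1 depends on more than the two headline numbers.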

I think his point here is that the effect in replication is closer to 0 than to the original claim. It might be more obvious if he chose an order of magnitude difference as an example - going from the dominant factor to technically-not-nothing might be replication but it's not vindication.
They reported two facts, statistical significance and effect size, one of which was not replicated. Of course this doesn't prove anything definitively, but it still arouses suspicion. As for whether it's exclamation-point-worthy, at worst it depends. For example, 0.5 standard deviation improvement in math scores might normally require a ton more expensive effort or even have been thought previously impossible, while a 0.2 was easy and not publication-worthy.
If someone is selling you on rethinking an entire discipline they are probably overstating their case a bit. Or you’re ignoring them outright.

That doesn’t mean they’re wrong, necessarily. Overcoming inertia is a huge challenge. Daunting, even.

Agree. In the face of institutional inertia - here crossing industry and academic fields - your starting point is "when I see a fly, I use a cannon." It is extremely difficult to budge. And so the usual reminder: the first 51% of communication is repetition
> the first 51% of communication is repetition

Is this a famous saying? It sounds nice but I hadn't heard it before. (And Google doesn't turn up anything obvious.)

That's my own turn of phrase.
I have found that the repetition usually only works if it comes from multiple sources. It's the primary reason I encourage the 'new guy' to take their "this doesn't make sense" questions up the chain of command or to the groups we collaborate with.

One, I hate to crush a spirit, although I'm just sending them to someone else to do it. Two, about one time in ten they come back with a preliminary roadmap to a solution.

I can tell Steve until I'm blue in the face that this code is nuts and get nowhere, or I can send three other people to tell him once and finally he'll take it seriously.

I don't think it's malicious, I think it's a combination of basic human psychology with learning strategies. Sometimes you have to shop for new instruction because the perspective your teacher brings simply doesn't resonate. Each person reinforces the neural connections and phrases it a little differently (perhaps especially with new people, because they are fresh off the street and don't use our jargon yet?)

It depends on how the replication is done, but the big replication projects typically use very large sample sizes compared to the original studies, so their error bars are much smaller.
Indeed, and even if the replication isn't significant, it doesn't mean that the replication and the original study significantly differ from each other.
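To make that concrete, a minimal sketch with invented numbers (the estimates match the article's example, but the standard errors are assumptions chosen for illustration). It uses a simple z-test on the difference between two estimates, treating their sampling errors as independent.

```python
import math
from statistics import NormalDist

def two_sided_p(estimate, std_error):
    """Two-sided p-value for H0: true value is 0 (normal approximation)."""
    z = estimate / std_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def difference_p(est_a, se_a, est_b, se_b):
    """Two-sided p-value for H0: both studies estimate the same effect."""
    se_diff = math.sqrt(se_a**2 + se_b**2)  # SE of the difference
    return two_sided_p(est_a - est_b, se_diff)

# Hypothetical standard errors; the replication is the more precise study.
original = (0.5, 0.20)
replication = (0.2, 0.12)

print(f"original vs 0:           p = {two_sided_p(*original):.3f}")
print(f"replication vs 0:        p = {two_sided_p(*replication):.3f}")
print(f"original vs replication: p = {difference_p(*original, *replication):.3f}")
```

With these particular numbers the original clears p < 0.05, the replication does not, and yet the two estimates are not significantly different from each other.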

Overall, the condescending trash talking in this article led me to flag it.

Could you speak a little more on that point?
bruh
I'm sorry, but this is such a ridiculous counter argument I'm speechless.

An exclamation point as criticism?

>How dare he wear tweed, his argument is invalid.

>The replication study isn't magically more accurate than the original study. If the original paper finds a 0.5 standard deviation effect and the replication study finds a 0.2 standard deviation effect, that increases our confidence that a real effect was measured

It also increases our confidence that the effect is small enough to be ignored. You can't pretend that the two studies are independent from each other. The second is directly the result of the first and you need to use Bayesian methods to calculate your belief in the result. The questions of 'is there an effect' and 'the effect size is >= 0.5 sd' give you two vastly different probabilities and vastly different policy responses.
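A rough sketch of what that looks like, under loudly stated assumptions: a skeptical normal prior centered on zero, a conjugate normal-normal update, and made-up standard errors for both studies. It treats the two sampling errors as independent given the true effect and does not model publication bias or selection, so take the exact numbers with a grain of salt.

```python
import math
from statistics import NormalDist

def normal_update(prior_mean, prior_sd, estimate, std_error):
    """Conjugate normal-normal update: returns posterior mean and sd."""
    prior_prec = 1.0 / prior_sd**2
    data_prec = 1.0 / std_error**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_mean * prior_prec + estimate * data_prec) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical skeptical prior: effect ~ Normal(0, 0.5^2), in SD units.
mean, sd = 0.0, 0.5
# Update on the original study (assumed SE 0.2), then the replication (assumed SE 0.1).
mean, sd = normal_update(mean, sd, 0.5, 0.2)
mean, sd = normal_update(mean, sd, 0.2, 0.1)

posterior = NormalDist(mean, sd)
p_any_effect = 1.0 - posterior.cdf(0.0)   # P(effect > 0)
p_big_effect = 1.0 - posterior.cdf(0.5)   # P(effect >= 0.5 SD)

print(f"posterior mean = {mean:.3f}, sd = {sd:.3f}")
print(f"P(effect > 0)    = {p_any_effect:.4f}")
print(f"P(effect >= 0.5) = {p_big_effect:.4f}")
```

Under these assumptions the posterior is nearly certain that there is some positive effect and nearly certain that it is below 0.5 SD, which is the gap between the two questions.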

> An exclamation point as criticism

As in "eats, shoots, and leaves", a little punctuation can totally change the meaning of a sentence. In this case, a period would have expressed agreement while an exclamation point expresses incredulity.

!