by mcswell 1425 days ago
"... evidence that nudging works. Perhaps not up to the standard of science..." That's pretty close to saying it doesn't work. The point of this meta-study was precisely to show that the evidence claimed to support nudging was probably attributable to random variation + unnatural selection, where the unnatural selection was publication choice: either the researchers who got negative (null) results chose not to bother writing them up and submitting them, or papers that reported negative results were rejected by publishers.

There are lots of people who do X for a living, but where X doesn't work: palm readers, fortune tellers, horoscope writers, and so on. I'm not even sure that funds managers reliably obtain results much above random.


I think what’s not clear is what’s in those papers, what exactly they have to say about nudging, and what definition they’re using. It strains credulity to think that changing defaults in software doesn’t change behavior, if only because most users aren’t technically savvy enough to change their settings.

On the other hand, the dream of nudge theory is something like a study done in the UK that suggests that adding the line “most of your fellow citizens pay their taxes” will increase the likelihood that people pay taxes. Here I’d be more likely to believe that the benefits are not clear and, more importantly, are difficult to replicate across time and culture.

It seems that a meta-analysis of all of nudge theory (or of large categories of it) would indeed show no impact. You’re not testing one thing; you’re pooling well-designed programs with ones that aren’t.

>That's pretty close to saying it doesn't work.

No, it's really not.

To say things a different way, I don't think this study will change anything for people actually doing choice architecture in applied settings. They have results that speak for themselves.

> results that speak for themselves.

This is exactly how a midwife explained to me why she uses magic crystals. She told me that there's science, and there's results, and that she's seen the crystals work.

Obviously they don't work by magical vibration, but are you sure they don't work at all? If the midwife feels and acts more confident from having that tool or the mother feels more relaxed because she thinks they will make the process easier, then the crystals do, in fact, work. They just don't work through the mechanism those individuals think they do.

I mean, yeah, if she has solid RCT data on thousands to millions of childbirths and has found a statistically significant impact from using the magic crystals, I would support their use. A/B testing and scientific research rest on the same basis.

The issue is that in fact the midwife will not have such data. The comparison being made is that A/B testing, if run competently, is pretty close to scientific research, in particular for research related to nudging.

I wonder how many engineers crack open a statistics book to find the correct test versus just plotting box plots and saying "see, looks pretty different".

To be fair, the more profound a result, the less math you need to convince anyone it is the case.

Maybe that's the case if you are parroting the result in front of investors instead of statisticians.

But if run rigorously, A/B testing is identical to scientific research, and the scientific research fails to show an effect.

The OP was referring to A/B tests that were "perhaps not up to the standard of science", not ones that were already science.

"I don't think this study will change anything for people actually doing choice architecture in applied settings." Probably true, but then evidence that horoscopes etc. don't work doesn't stop people from drawing up horoscopes, or other people from relying on their horoscope to plan out their day.

"They have results that speak for themselves." Let me put my point differently. Suppose that nudges don't have any effect at all (null hypothesis). More concretely--and just to take a random number--suppose that 50% of the time when a nudge is used, the nudgees happen to behave in the direction that the nudge was intended to move them, and 50% of the time they don't move, or they move in the opposite direction. And suppose there are a number of nudgers, maybe 100. Then some nudgers will get better than random results, while others will get no result, or negative results. The former nudgers will have results that appear to speak for themselves, even if the nudges actually have no effect whatsoever.

This is the same as asking if a fair coin is tossed ten times, what is the probability that you'll get at least 7 heads. The probability of such a number of heads in a single run is ~17%. So 17% of those nudgers could be getting apparently significant results, even if their results are actually random.
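
For anyone who wants to check that figure, here is a minimal sketch in Python. The function name `prob_at_least` is made up for illustration, and the setup is just the assumption from above: 100 nudgers, each running 10 independent trials with a 50% chance of an apparent success under the null hypothesis.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p (binomial upper tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Fair coin, ten tosses, at least 7 heads: 176/1024
p_tail = prob_at_least(7, 10)
print(f"P(at least 7 heads in 10 tosses) = {p_tail:.4f}")  # 0.1719

# Hypothetical 100 nudgers, all with zero real effect
n_nudgers = 100
print(f"Expected nudgers with 7+ apparent successes: {n_nudgers * p_tail:.1f}")
```

So roughly 17 of the 100 hypothetical nudgers would see 7 or more "successes" out of 10 by chance alone, which is where the ~17% comes from.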

I think gp and you probably see eye to eye, but gp has a problem with your phrasing. If the effect does not live up to scientific rigour, that (more or less) implies that the effect is roughly indistinguishable from randomness.

If folks have results that speak for themselves, then the effect more than likely is scientifically rigorously testable. It may already have been - by those very results.

They would be the people who published, in this scenario.