Hacker News
by spwa4 3 days ago
Because correlation is not causation. If A and B correlate, there are 3 options:

1) A causes B

2) B causes A

3) C causes both B and A (in some order)

4) your correlation figure is bullshit (hence not counted in the 3 options, but certainly with news these days, it must be mentioned)

A famous way to illustrate where this goes wrong is to show a map of which libraries loaned out Harry Potter books, and a map of where poodles got raped. Very high correlation, and obviously an example of the 3rd option.

(Obviously both were caused by population density, which leads to both library creation and poodle-related crimes, and probably non-poodle-related crimes too.)
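To make the 3rd option concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the variable names, the coefficients and the noise levels are assumptions, not real library or crime data. Both "loans" and "crimes" depend only on a hidden "density" variable, yet they correlate strongly with each other until density is controlled for.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)

# C: hypothetical population density for 2000 made-up districts.
density = [rng.uniform(0, 100) for _ in range(2000)]
# A and B each depend on C plus independent noise, never on each other.
loans = [3.0 * d + rng.gauss(0, 30) for d in density]
crimes = [0.5 * d + rng.gauss(0, 5) for d in density]

raw_r = pearson(loans, crimes)
# Partial correlation: correlate what remains after regressing out density.
partial_r = pearson(residuals(loans, density), residuals(crimes, density))

print(f"raw correlation:               {raw_r:.2f}")
print(f"after controlling for density: {partial_r:.2f}")
```

The raw correlation comes out high, while the partial correlation sits near zero, which is what you would expect when C drives both A and B.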


The 5th option is random chance.

That often results from p-hacking. In a world of infinite variables, if you look hard enough you are guaranteed to eventually find two completely unrelated variables whose correlation is statistically significant in your sample.
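A rough sketch of how that plays out, assuming 40 independent noise variables with 30 observations each (both numbers are arbitrary choices for illustration). The cutoff of 0.361 is the approximate tabulated value of |r| for two-tailed p < 0.05 at n = 30. None of the variables are related, but testing every pair still turns up "significant" correlations.

```python
import random
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

N_VARS, N_OBS = 40, 30   # arbitrary sizes, chosen for illustration
R_CRIT = 0.361           # approx. |r| cutoff for two-tailed p < 0.05 at n = 30

rng = random.Random(1)
# Every variable is pure independent noise: no pair is truly related.
variables = [[rng.gauss(0, 1) for _ in range(N_OBS)] for _ in range(N_VARS)]

pairs = list(combinations(range(N_VARS), 2))
hits = [(i, j) for i, j in pairs
        if abs(pearson(variables[i], variables[j])) > R_CRIT]

print(f"{len(hits)} of {len(pairs)} unrelated pairs look 'significant'")
```

With 780 pairs and a 5% false-positive rate, you should expect a few dozen hits on average, so reporting only the best-looking pair is exactly the p-hacking trap.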

That's the 4th option
I guess it could be? I interpreted what the parent commenter wrote as "the variables aren't actually correlated" (which definitely does happen sometimes).

Whereas my point is more about when the variables really are correlated in the data, but it's purely due to random chance. Not bullshit, per se, just bad luck (or possibly p-hacking).

(Though the solution to both is the same - you shouldn't trust a study until it's been independently replicated on new data.)
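A small sketch of why replication helps, under the assumption that the variables are nothing but independent noise (the sizes and seed are arbitrary). It picks the strongest-looking pair from an exploratory sample, then measures the same pair again on fresh data.

```python
import random
from itertools import combinations

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def draw(rng, n_vars, n_obs):
    """Draw n_vars independent noise variables with n_obs observations each."""
    return [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]

rng = random.Random(2)

# Exploratory "study": search all pairs and keep the most impressive one.
explore = draw(rng, 40, 30)
best = max(combinations(range(40), 2),
           key=lambda p: abs(pearson(explore[p[0]], explore[p[1]])))
found_r = pearson(explore[best[0]], explore[best[1]])

# Replication: the same two variables, measured on new and larger data.
fresh = draw(rng, 40, 300)
replicated_r = pearson(fresh[best[0]], fresh[best[1]])

print(f"pair {best}: r = {found_r:.2f} in the original sample")
print(f"pair {best}: r = {replicated_r:.2f} on fresh data")
```

The cherry-picked correlation looks strong in the original sample and shrinks toward zero on the new data, because there was never a real relationship to replicate.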