HN Mirror



	by mcguire 2671 days ago
	I'd like to say that the author has been reading The Book of Why, but it seems that he hasn't, because he missed the punch line of the section on the paradox: you need a causal model to separate the two branches of the paradox. It's as easy to construct examples where the overall view is correct as it is to construct examples where the separate views are.

2 comments

jonahx 2671 days ago

I'm unclear: what was the incorrect claim you're saying the author made?


whatshisface 2671 days ago

The parent is not saying the author made an incorrect claim. They are saying that the author did not continue their argument to arrive at a conclusion that someone else had: that causal models are what tells you when you can combine datasets and when you can't.


chii 2670 days ago

> causal models are what tells you when you can combine datasets and when you can't.

But then the causal model is subjective, right? What if there are two different causal models, and it cannot be known a priori which is the "true" one?

Can the selection of the causal model be used to justify the dataset, in order to push a particular agenda?


whatshisface 2670 days ago

Your job when analysing data is simply to enumerate the possibilities and assign likelihoods to them if possible. If two models fit equally well, you're supposed to write them both down in the hope that someone will collect further data to distinguish between them.

If you're cutting holes in your report for political reasons, that's just not doing the job. That's what pundits are paid to do, not (ideally at least) scientists. Fraud is easy to commit, and the fact that it's possible is not that hard of a philosophical issue.


chii 2670 days ago

How do you tell whether a paper whose conclusions support an agenda was written with correct scientific rigor rather than fraudulently? Using Simpson's paradox, one can obfuscate one's biases by making the desired conclusion drop out of the data.


tobiasSoftware 2671 days ago

Simpson's paradox is about a data conflict between an overall view and a more specific view. For example, in the kidney stone scenario, say you find treatment A is more successful overall, but treatment B is more successful for both small stones and big stones when the data is broken down by stone size. The article indicates that the specific view is always correct, so treatment B should be used in the future, whereas the commenter is saying that context is important to determine which treatment should be used.
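
To make the reversal concrete, here is a minimal Python sketch. The counts are illustrative assumptions, patterned on the widely quoted kidney stone figures but with the labels arranged to match the scenario described in this comment; they are not data from the article, and the names `counts`, `subgroup_rates`, and `overall_rate` are made up for the example.

```python
# Hypothetical counts: (successes, patients) per treatment and stone size.
# Assumed for illustration only; not taken from the article.
counts = {
    "A": {"small": (234, 270), "large": (55, 80)},
    "B": {"small": (81, 87), "large": (192, 263)},
}


def subgroup_rates(treatment):
    """Success rate within each stone-size group for one treatment."""
    return {size: s / n for size, (s, n) in counts[treatment].items()}


def overall_rate(treatment):
    """Success rate with the stone-size groups pooled together."""
    successes = sum(s for s, _ in counts[treatment].values())
    patients = sum(n for _, n in counts[treatment].values())
    return successes / patients


for treatment in counts:
    per_size = ", ".join(
        f"{size}: {r:.3f}" for size, r in subgroup_rates(treatment).items()
    )
    print(f"{treatment}  overall: {overall_rate(treatment):.3f}  ({per_size})")
```

With these assumed counts, A has the higher pooled rate while B has the higher rate in both the small and the large group, because A was given mostly to the easier small-stone cases. The numbers alone do not say which comparison to trust.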


mcguire 2671 days ago

Exactly. With a causal model (which can be validated independently) you have a principled reason for choosing which variables to control for.
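
As a rough sketch of what "choosing which variables to control for" can look like in code: if the causal model says stone size influences both which treatment is chosen and the outcome (a confounder), one standard move is to average the per-size success rates weighted by how common each size is in the whole population. If the model instead said the stratifying variable was a consequence of treatment, the pooled comparison would be the relevant one. The counts and the function names below are assumptions made up for illustration, not data or code from the article.

```python
# Hypothetical counts: (successes, patients) per treatment and stone size.
# Assumed for illustration only.
counts = {
    "A": {"small": (234, 270), "large": (55, 80)},
    "B": {"small": (81, 87), "large": (192, 263)},
}
sizes = ("small", "large")


def crude_rate(treatment):
    """Pooled success rate, ignoring stone size."""
    successes = sum(counts[treatment][z][0] for z in sizes)
    patients = sum(counts[treatment][z][1] for z in sizes)
    return successes / patients


def adjusted_rate(treatment):
    """Size-adjusted rate: sum over z of P(success | treatment, z) * P(z).

    P(z) is the share of all patients (both treatments) with stone size z.
    This is only the right quantity if stone size is a confounder.
    """
    total = sum(counts[t][z][1] for t in counts for z in sizes)
    rate = 0.0
    for z in sizes:
        s, n = counts[treatment][z]
        p_z = sum(counts[t][z][1] for t in counts) / total
        rate += (s / n) * p_z
    return rate


for t in counts:
    print(f"{t}  crude: {crude_rate(t):.3f}  adjusted: {adjusted_rate(t):.3f}")
```

Under the confounder assumption the adjusted rates favor B; under a model where size should not be controlled for, the crude rates favor A. The same table supports either answer, which is why the causal model has to come from outside the data.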


forrestthewoods 2668 days ago

Post author here. Can confirm I’ve not read The Book of Why!

I’ll add it to my reading list.


mcguire 2667 days ago

A warning: it's seriously self-congratulatory. But I don't know of anything better.
