Hacker News
by tomp 2994 days ago
> You cannot dismiss rigorous statistical analysis by arguing it can never encompass the full dimensions of the data.

This simply means it's not rigorous. See Omitted-variable bias - from [1]: The bias results in the model attributing the effect of the missing variables to the estimated effects of the included variables. For example, including gender but not education or hours worked will result in attributing pay differences to gender, but including all relevant variables shows that gender is irrelevant.

[1] https://en.wikipedia.org/wiki/Omitted-variable_bias
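
To make the mechanism concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration only: the numbers are invented, `simulate` and `ols` are hypothetical helpers, and pay is constructed to depend on hours worked alone, with group membership affecting only the hours. It says nothing about real pay data; it only shows how leaving out a relevant variable shifts its effect onto an included one.

```python
import random


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n=5000, seed=0):
    """Made-up data: pay depends on hours only; group shifts hours."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.randint(0, 1)
        # Assumption: group 1 works about 5 fewer hours on average.
        hours = rng.gauss(40 - 5 * group, 3)
        # Assumption: group has no direct effect on pay.
        pay = 100 + 20 * hours + rng.gauss(0, 25)
        rows.append((group, hours, pay))
    return rows


rows = simulate()
y = [pay for _, _, pay in rows]

# Short model: intercept + group (hours omitted).
short = ols([[1.0, g] for g, _, _ in rows], y)
# Full model: intercept + group + hours.
full = ols([[1.0, g, h] for g, h, _ in rows], y)

print("group coefficient, hours omitted: ", round(short[1], 2))
print("group coefficient, hours included:", round(full[1], 2))
print("hours coefficient, full model:    ", round(full[2], 2))
```

With hours omitted, the group coefficient should come out strongly negative even though group has no direct effect in this setup; with hours included, it should shrink to roughly zero. The reverse check matters just as much: had the simulation given group a direct effect, the coefficient would survive the added control.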


You entirely missed the point. We can never include every single relevant variable to perfectly explain the observation. It's impossible.

This doesn't mean statistics is useless.

This is the meaning of the phrase "the map is not the territory". All models are flawed, but some are useful.

No, statistics isn't useless, but its usefulness cuts both ways: if you can add one or two relevant variables and almost entirely remove the observed effect, then statistics tells you that the effect was only there due to omitted-variable bias.

Sure, that's fine. That's part of using statistics to the best of our ability.

I think there's some middle ground between saying that the analysis is rigorous and saying it's useless, no?

Indeed. But that's where all the hard work is -- trying to determine how rigorous something is, knowing it certainly isn't perfectly rigorous.