Hacker News
by saosebastiao 3388 days ago
There is nothing wrong with controls when they're relevant and have a convincing motive behind their inclusion.

What's not okay is sequentially trying all of the possible analyses and then stopping the moment you find the exact combination of variables that tells you what you had already assumed to be true. Especially so when simple analyses point to A, and you keep adding new variables until you get to B, which is exactly the case here. That is a very well-known abuse of statistics. There is a reason all the well-known and popular information criteria (which measure model quality) penalize the number of parameters in the model.

And while adding control variables isn't bad per se, there are proper precautions to take when doing so, such as segmented sampling, non-linearity transformations, and even controlled experiments, and they become exponentially more costly the more variables you add. Because these fraudsters have a motive, the model only needs to be as rigorous as necessary to secure their predetermined conclusion. The "keep adding variables" approach almost always ends up as a way to lie with statistics.
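To put a rough number on the "try analyses until one works" problem, here is a minimal sketch. It assumes every specification is an independent test of an effect that does not actually exist, at the usual 0.05 threshold. Real specification searches reuse the same data, so the tests are correlated and the true figure will differ; treat this as an illustration only. The function name is made up for this example.

```python
def chance_of_spurious_hit(num_specifications, alpha=0.05):
    """Probability that at least one of num_specifications independent
    tests of a true null hypothesis comes out 'significant' at level alpha.

    Assumption: the tests are independent, which is a simplification.
    """
    return 1 - (1 - alpha) ** num_specifications


# The more specifications you are willing to try before stopping,
# the more likely it is that one of them "works" by chance alone.
for m in (1, 5, 20, 50):
    print(f"{m:>2} specifications tried: "
          f"{chance_of_spurious_hit(m):.0%} chance of a spurious hit")
```

Under those assumptions, a single pre-specified analysis has a 5% false positive rate, while stopping at the first "success" out of twenty has well over a 50% chance of finding one.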

1 comments

Wow, I've seen so many comments by climate change deniers using arguments so similar to yours that I literally had deja vu.

The horseshoe nature of politics will never cease to amuse me, though; thanks for the example.

Also, if a hypothesis is supported by simple studies but falls apart under more complex ones, it might be too simplistic a hypothesis. Almost as if sexism (and sexism guilt-slinging) wasn't an entirely black-and-white problem. Who'd have thought?

Please don't add snark into inflammatory comments about already divisive topics. That's destructive of the kind of discussion we're hoping for here.

https://news.ycombinator.com/newsguidelines.html

https://news.ycombinator.com/newswelcome.html

Climate change deniers are some of the biggest users of the flawed analyses that I'm talking about. For example, several climate change deniers argue that if you control for urban heat island effects, the magnitude of the human component decreases substantially. Such willy-nilly expansions of model complexity fall apart when modeled more rigorously [0][1].

You'll notice that I never claimed that new parameters universally make the model worse. All of the common information criteria [2][3][4] can numerically show an improvement in model quality as the parameter space grows...it's just unlikely. Such p-value hunting might give you the p-value you're looking for, but it is very unlikely to improve the model.
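For concreteness, here is a minimal sketch of how two of those criteria, AIC [4] and BIC [2], trade fit against parameter count, using their standard definitions (lower is better). The log-likelihoods, parameter counts, and sample size below are made-up numbers for illustration, not results from any real study.

```python
import math


def aic(log_likelihood, num_params):
    # Akaike information criterion: 2k - 2 ln(L)
    return 2 * num_params - 2 * log_likelihood


def bic(log_likelihood, num_params, num_obs):
    # Bayesian information criterion: k ln(n) - 2 ln(L)
    return num_params * math.log(num_obs) - 2 * log_likelihood


# Hypothetical example: three extra control variables raise the
# log-likelihood only slightly (from -100.0 to -99.5) on n = 200 rows.
n = 200
simple = {"log_likelihood": -100.0, "num_params": 3}
bigger = {"log_likelihood": -99.5, "num_params": 6}

print("AIC simple:", aic(**simple), " AIC bigger:", aic(**bigger))
print("BIC simple:", bic(**simple, num_obs=n),
      " BIC bigger:", bic(**bigger, num_obs=n))
```

With these numbers both criteria prefer the simpler model. For the bigger model to win on AIC, each added parameter has to raise the log-likelihood by more than 1 on average; on BIC the bar is ln(n)/2 per parameter. So an improvement is possible, but a variable that barely changes the fit will not clear it.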

Sexism is a hard problem. Throwing variables wantonly at models until sexism disappears isn't doing anything for the problem...it's nothing more than a pseudoscientific way of pretending it doesn't exist.

[0] https://www.skepticalscience.com/urban-heat-island-effect.ht...

[1] http://news.stanford.edu/news/2011/october/urban-heat-island...

[2] https://en.wikipedia.org/wiki/Bayesian_information_criterion

[3] https://en.wikipedia.org/wiki/Deviance_information_criterion

[4] https://en.wikipedia.org/wiki/Akaike_information_criterion