| HN Mirror


	by BugsJustFindMe 488 days ago
	> Now, a lot of these studies try to "control for" the problem I just stated - they say things like "We examined the effect of X and Y, while controlling for Z [e.g., how wealthy or educated the people/countries/whatever are]." How do they do this? The short answer is, well, hm, jeez.

	You mean they don't cluster the data into sets of overlapping bins where the controlled attribute has approximately the same value and then look for the presence of an XY relationship within the bins instead of across them?

1 comments

Sniffnoy 488 days ago

No. What they actually do is run a regression with both X and Z among the independent variables, and then look solely at the coefficients coming from X. (As mentioned in the article.) Including Z as an independent variable alongside X "controls for" it in that now the coefficients for X are supposed to not include any effect from Z (since any Z effect should go in the Z coefficients). How well this works is something I don't know enough to answer.

I don't actually know how the method you suggest compares in the limit of finer bins. It's possible it would achieve similar results?
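To make the regression version concrete, here is a minimal sketch on made-up data. Everything in it is an assumption for illustration: the data are simulated so that Z drives both X and Y while X has no effect on Y, and `ols` is a small hand-rolled least-squares helper, not any particular library's API.

```python
import random

def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

rng = random.Random(0)                         # fixed seed, simulated data
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]        # confounder Z
x = [zi + rng.gauss(0, 1) for zi in z]         # X depends on Z
y = [2.0 * zi + rng.gauss(0, 1) for zi in z]   # Y depends on Z only, not on X

# Columns are [intercept, X] and [intercept, X, Z] respectively.
naive = ols([[1.0, xi] for xi in x], y)
adjusted = ols([[1.0, xi, zi] for xi, zi in zip(x, z)], y)

print("X coefficient without Z:", round(naive[1], 2))
print("X coefficient with Z:   ", round(adjusted[1], 2))
```

Under these assumptions the X coefficient should land near 1 when Z is left out and near 0 once Z is included, which is the sense in which the Z effect "goes in the Z coefficients". It only works this cleanly here because the simulated relationship really is linear and Z is measured without error.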

KempyKolibri 488 days ago

The smaller bins approach is adjustment via stratification.

Good primer on both here: https://www.mynutritionscience.com/p/statistical-adjustment
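For a rough feel of stratification, here is a small sketch on simulated data. The setup is an assumption for illustration only: Z drives both X and Y, X has no effect on Y, and `stratified_slope` is a hypothetical helper that sorts by Z, cuts the data into equal-count bins, and averages the within-bin slope of Y on X.

```python
import random

def slope(xs, ys):
    """Simple-regression slope of ys on xs."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / sxx

def stratified_slope(x, y, z, n_bins):
    """Size-weighted average of within-bin slopes, using equal-count bins of Z."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    size = len(order) // n_bins
    total, weight = 0.0, 0
    for b in range(n_bins):
        # The last bin absorbs any leftover points.
        idx = order[b * size:(b + 1) * size] if b < n_bins - 1 else order[b * size:]
        total += len(idx) * slope([x[i] for i in idx], [y[i] for i in idx])
        weight += len(idx)
    return total / weight

rng = random.Random(0)                         # fixed seed, simulated data
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]        # confounder Z
x = [zi + rng.gauss(0, 1) for zi in z]         # X depends on Z
y = [2.0 * zi + rng.gauss(0, 1) for zi in z]   # Y depends on Z only, not on X

coarse = stratified_slope(x, y, z, n_bins=2)
fine = stratified_slope(x, y, z, n_bins=50)
print("within-bin X slope,  2 bins:", round(coarse, 2))
print("within-bin X slope, 50 bins:", round(fine, 2))
```

With only two bins, Z still varies a lot inside each bin, so a good chunk of the spurious X-Y slope should survive; with many narrow bins it should shrink toward zero. The cost is that each bin holds fewer points, so the within-bin estimates get noisier as the bins get finer.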

youainti 484 days ago

My understanding is that in the limit, it does the same thing, but with more of a flattened tree representation.