by xpe 1009 days ago

My comment did two things (but they were somewhat muddled). It: (a) laid out a particular model; and (b) offered {explanations/claims} of causality. But unfortunately it said nothing about (c) experimental design.

I'll start with (c). Attempting to talk about a model in isolation from its experimental design can be misleading, as it ignores the context that gives the model its interpretive power and validity. In this case, a good experimental design must include a sufficiently diverse sample of people to account for variation.

Regarding (b), depending on the person, the influence could flow either way between `I` and `T`, to varying degrees.

- Example of `I->T`: One person might come into the store strongly preferring one type of ice cream (`I`) and be willing to take time to look for it (`T`).

- Example of `T->I`: Another person might come into the store in a hurry (`T`) and be motivated to procure the closest ice cream flavor (`I`).

Regarding (a), no model is 'true' but some are better than others for particular purposes.

- To the extent that prediction is the key goal, confounding variables don't usually matter, provided the data at prediction time resembles the data the model was fit on.

- But to the extent that _statistical inference_ is the key goal, there are many techniques for teasing apart influence.
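To make the prediction vs. inference distinction concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: `z`, `x`, `y`, the coefficients, and the `ols` helper are hypothetical and are not the `I` and `T` from the ice cream example. It assumes a single confounder `z` that drives both `x` and `y`, while `x` itself has no causal effect on `y`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical setup: z is a confounder that drives both x and y.
# x has NO causal effect on y in this simulation.
z = rng.normal(size=n)
x = z + 0.5 * rng.normal(size=n)
y = 2.0 * z + 0.5 * rng.normal(size=n)


def ols(features, target):
    """Least-squares fit; returns (coefficients, design matrix).

    The first coefficient is the intercept.
    """
    design = np.column_stack([np.ones(len(target)), *features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef, design


# Model 1: predict y from x alone, ignoring the confounder.
coef_naive, design_naive = ols([x], y)
resid = y - design_naive @ coef_naive
r2_naive = 1.0 - resid.var() / y.var()

# Model 2: also include z, so the x coefficient reflects x's own effect.
coef_adj, _ = ols([x, z], y)

slope_naive = coef_naive[1]
slope_adjusted = coef_adj[1]

print(f"R^2 using x alone:        {r2_naive:.3f}")
print(f"slope on x, unadjusted:   {slope_naive:.3f}")
print(f"slope on x, adjusted (z): {slope_adjusted:.3f}")
```

Under these assumptions, `x` alone predicts `y` reasonably well, yet its unadjusted slope is far from zero even though `x` does nothing to `y`. Once `z` is included, the slope on `x` should shrink to roughly zero. Adjusting for a measured confounder is only one of the techniques for teasing apart influence, and it only works when the confounder is actually observed.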

Unfortunately, too often in machine learning contexts, the word "inference" refers to the process of using a trained model for _prediction_. Yikes. This contrasts sharply with the term's use in statistics. The field of statistics got this one right, even as ML techniques have taken off spectacularly.