by zwaps 2853 days ago

I think that ML is very useful, but remember that forecasting is really not the main objective of econometric models.

Basically, forecasting implies you have a good handle on all properties of the relevant distributions, which in my opinion is a lost cause in social sciences (think external validity).

Instead, econometrics is nowadays mainly concerned with the identification of causal effects using non-parametric or semi-parametric approaches. Basically, you can believably estimate the direction of some mechanism, but you probably never have the data or model to make a good out-of-sample prediction. You can try, but it's basically implied that approaches that consistently estimate some marginal effect of a conditional expectation will NOT be that useful for predicting a whole stochastic process.
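To make that concrete, here is a rough sketch in plain Python, under made-up assumptions: a simulated outcome with a true marginal effect of 0.5 and a lot of unexplained noise. The numbers, the names (`TRUE_BETA`, `simulate`, `ols_fit`) and the data-generating process are all hypothetical, chosen only for illustration. The slope is recovered quite well, yet the fitted line explains almost none of the out-of-sample variation.

```python
import random

def ols_fit(xs, ys):
    """One-regressor OLS by hand: returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variance of ys explained by the fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def simulate(n, beta, noise_sd, rng):
    """Hypothetical process: randomized x, y = beta * x + large noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [beta * x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys

rng = random.Random(0)
TRUE_BETA = 0.5  # assumed marginal effect, for illustration only

x_train, y_train = simulate(20000, TRUE_BETA, 5.0, rng)
x_test, y_test = simulate(20000, TRUE_BETA, 5.0, rng)

intercept, slope = ols_fit(x_train, y_train)
oos_r2 = r_squared(x_test, y_test, intercept, slope)

print(f"estimated slope: {slope:.3f} (true value {TRUE_BETA})")
print(f"out-of-sample R^2: {oos_r2:.3f}")
```

With this setup the population R^2 is about 0.01, so even a perfectly estimated effect predicts individual outcomes poorly.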

Also, using training and test sets kind of presupposes that your process is very stable. Otherwise the "test" set is not really a good test, is it? Again, in social sciences these things are hard to argue. You usually wanna generalize some mechanism from this industry to that industry, not find a good predictor in the same industry. The test set is still drawn from the same data!
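A toy illustration of that point, again in plain Python and with invented numbers: two hypothetical "industries" where the same variable has a different effect (slopes of 2.0 and -1.0 are pure assumptions). The held-out test set from industry A looks fine, because it comes from the same process as the training data, while the same model transfers badly to industry B.

```python
import random

def fit_line(xs, ys):
    """One-regressor OLS by hand: returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def mse(xs, ys, intercept, slope):
    """Mean squared prediction error of the fitted line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def draw(n, slope, rng):
    """Hypothetical industry: y = slope * x + unit-variance noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [slope * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

rng = random.Random(1)
xa, ya = draw(5000, 2.0, rng)    # industry A (assumed mechanism)
xb, yb = draw(5000, -1.0, rng)   # industry B (assumed, different mechanism)

# Ordinary train/test split, but both parts come from industry A.
x_train, y_train = xa[:4000], ya[:4000]
x_test, y_test = xa[4000:], ya[4000:]

a, b = fit_line(x_train, y_train)
mse_same = mse(x_test, y_test, a, b)   # test error within industry A
mse_other = mse(xb, yb, a, b)          # error when transferred to industry B

print(f"test MSE, same industry:  {mse_same:.2f}")
print(f"MSE on the other industry: {mse_other:.2f}")
```

The split only certifies performance for the process that generated it; it says nothing about whether the mechanism carries over.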

ML is successful because in practice we DO care about prediction. This allows us to do all the cool things. Because econometrics/stats is so conservative and comes from a causal standpoint, people are just really shy about developing a model for prediction (not true everywhere, but that's the gist). For ML, the primary question is basically how well the thing predicts. When I first tried scikit-learn way back, I was so confused that it didn't offer standard errors or some other statistical measure. But then I saw how ingrained the in-sample/out-of-sample process is and I thought, well, that's really useful.
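For what it's worth, the two habits can sit side by side. The sketch below is plain Python on simulated data (the true slope of 0.8, the sample sizes and all variable names are assumptions for illustration, and nothing here is scikit-learn's API). It reports the classical standard error and a rough 95% interval for the slope, which is the stats question, next to in-sample and out-of-sample RMSE, which is the ML question.

```python
import math
import random

rng = random.Random(42)
n = 400
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [1.0 + 0.8 * x + rng.gauss(0, 1) for x in xs]  # assumed true model

# In-sample / out-of-sample split.
x_tr, y_tr = xs[:300], ys[:300]
x_te, y_te = xs[300:], ys[300:]

# Fit one-regressor OLS by hand on the training part.
m = len(x_tr)
mx = sum(x_tr) / m
my = sum(y_tr) / m
sxx = sum((x - mx) ** 2 for x in x_tr)
sxy = sum((x - mx) * (y - my) for x, y in zip(x_tr, y_tr))
slope = sxy / sxx
intercept = my - slope * mx

# Stats view: classical (homoskedastic) standard error of the slope.
resid = [y - (intercept + slope * x) for x, y in zip(x_tr, y_tr)]
s2 = sum(r * r for r in resid) / (m - 2)   # residual variance estimate
se_slope = math.sqrt(s2 / sxx)
ci = (slope - 1.96 * se_slope, slope + 1.96 * se_slope)  # normal approximation

# ML view: how well does it predict, in sample and out of sample?
rmse_in = math.sqrt(sum(r * r for r in resid) / m)
rmse_out = math.sqrt(
    sum((y - (intercept + slope * x)) ** 2 for x, y in zip(x_te, y_te)) / len(x_te)
)

print(f"slope = {slope:.3f}, se = {se_slope:.3f}, approx 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"RMSE in-sample = {rmse_in:.3f}, out-of-sample = {rmse_out:.3f}")
```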

tl;dr: Stats and ML have different objectives, but there is a lot to learn in stats for ML