by claytonjy 2790 days ago
I had exactly that post in mind; it really raised my awareness of these issues.

I agree with Jake's conditional interpretation of the estimates, but the practical issue is that virtually nobody without solid statistical training will apply it correctly. In particular, people tend to do exactly what Jake concedes rarely makes any sense: comparing estimates across different model specifications.

You and I might interpret these betas just fine, but if we show them to a less stats-y audience, will they?


I guess it depends. I have the luxury of working in a very "this is machine learning, which is not to be confused with statistical inference" problem domain. It doesn't even really make sense to interpret most of the models I build as describing any sort of causal relationship, and when people look at the parameter estimates, they're really just trying to figure out, "What does this model think is important?"
That sounds nice!

Feature ranking seems like a clearly safe interpretation of betas, though I've been bitten too often by letting glm (in R) scale my predictors and then hand back estimates on the original scales, which makes them incomparable, and I've seen it happen to others even more. It's easy to miss when your original scales aren't all that different.
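
To make the scaling pitfall concrete, here's a minimal sketch in Python (stdlib only, not R, and not a reproduction of any fitting routine's internals). The predictor names, the data, and the coefficient values are all made up for illustration. It only shows the arithmetic: a coefficient on the original scale times that predictor's standard deviation gives the per-standard-deviation coefficient, and ranking by one can disagree with ranking by the other.

```python
from statistics import pstdev

# Hypothetical predictors on very different scales (made-up data).
columns = {
    "age_years": [23, 35, 47, 52, 61, 38],
    "income_dollars": [32000, 58000, 91000, 40000, 120000, 75000],
}

# Hypothetical coefficients as reported on the ORIGINAL scales:
# change in the linear predictor per 1 year and per 1 dollar.
beta_original = {"age_years": 0.05, "income_dollars": 0.00004}

# Per-standard-deviation coefficients: beta_original * sd(predictor).
beta_standardized = {
    name: beta_original[name] * pstdev(values)
    for name, values in columns.items()
}

def rank(betas):
    """Order predictor names by absolute coefficient size, largest first."""
    return sorted(betas, key=lambda name: abs(betas[name]), reverse=True)

rank_original = rank(beta_original)
rank_standardized = rank(beta_standardized)

print("ranked by original-scale betas:", rank_original)
print("ranked by per-SD betas:        ", rank_standardized)
```

With these made-up numbers the two rankings come out in opposite orders, which is the trap: the original-scale betas mostly reflect the units, not importance. When the scales are closer together the flip is subtler and easier to miss.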