by bunderbunder
2785 days ago
Logistic regression isn't sexy, but it can still achieve near state-of-the-art results, is reasonably resistant to bias^H^H^H^H variance, and generates parameters that you can easily explain to someone with no background in math. There's a lot of value in all that. Especially if your deliverable is something that a business is going to use, and not just a Kaggle entry.
I know it _seems_ that way, but there's a surprising amount of nuance there and I think we're both fooling and limiting ourselves by letting this idea fester.
For one, unlike linear regression, logistic regression estimates aren't collapsible, so you can NOT interpret them as "changing this input by X changes the output by Y". That reading only holds if your model includes every relevant covariate, which never happens, though in practice the interpretation might not be _that_ far off.
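To make the non-collapsibility point concrete, here is a minimal sketch using only the standard library. It assumes a made-up true model, logit P(y=1 | x, z) = b0 + b1*x + b2*z, with a binary covariate z that is independent of x (so z is not a confounder), and computes the log odds ratio for x that you would get after leaving z out. The coefficients and the function name `marginal_log_odds_ratio` are illustrative assumptions, not anything from a real dataset or library.

```python
import math


def sigmoid(t):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def marginal_log_odds_ratio(b0, b1, b2, p_z=0.5):
    """Log odds ratio for binary x once z is averaged out of the model.

    Assumes z ~ Bernoulli(p_z) independently of x, so any change in the
    estimate is due to non-collapsibility, not confounding.
    """
    def p_y_given_x(x):
        return ((1.0 - p_z) * sigmoid(b0 + b1 * x)
                + p_z * sigmoid(b0 + b1 * x + b2))

    return logit(p_y_given_x(1)) - logit(p_y_given_x(0))


# Hypothetical coefficients: the conditional effect of x is 1.0 log-odds.
conditional = 1.0
marginal = marginal_log_odds_ratio(b0=-1.0, b1=conditional, b2=3.0)

print(f"conditional log-OR for x: {conditional:.3f}")
print(f"marginal log-OR for x (z omitted): {marginal:.3f}")  # about 0.67
```

With these numbers the effect of x shrinks from 1.0 to roughly 0.67 just because an independent covariate was dropped. In a linear regression, omitting an independent covariate would leave the coefficient on x unchanged.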
Another issue I see is practitioners not being aware of scaled vs. unscaled estimates. I've seen real papers from AI groups treat logistic regression estimates as feature importance rankings while leaving the estimates on the scale of the original features, and the authors didn't understand the distinction when confronted about it.
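A rough illustration of why the scale matters, again with the standard library only. The coefficients, feature names, and sample values below are all hypothetical. The sketch relies on the fact that in an unpenalized fit, standardizing a feature multiplies its coefficient by that feature's standard deviation, so a ranking by raw coefficient magnitude mostly reflects the units the features happen to be measured in.

```python
import statistics

# Hypothetical raw-scale coefficients: log-odds change per one unit of each feature.
raw_coefs = {"income_usd": 0.00005, "age_years": 0.05}

# Hypothetical observed values, used only to get each feature's spread.
samples = {
    "income_usd": [20000, 40000, 60000, 80000],
    "age_years": [25, 35, 45, 55],
}


def standardized_coefs(raw_coefs, samples):
    """Convert per-unit coefficients to per-standard-deviation coefficients."""
    return {
        name: coef * statistics.pstdev(samples[name])
        for name, coef in raw_coefs.items()
    }


def rank_by_magnitude(coefs):
    """Feature names sorted from largest to smallest absolute coefficient."""
    return sorted(coefs, key=lambda name: abs(coefs[name]), reverse=True)


raw_ranking = rank_by_magnitude(raw_coefs)
std_ranking = rank_by_magnitude(standardized_coefs(raw_coefs, samples))

print("ranking on raw scale:         ", raw_ranking)
print("ranking on standardized scale:", std_ranking)
```

Here the raw ranking puts age first only because a dollar is a much smaller unit than a year. Per standard deviation, income has the larger coefficient, and the ranking flips.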
From a practical standpoint, I think practitioners are much better served using random forests as their initial exploratory models. Less effort for results that are in practice at least as good as a well-prepped logit. Plenty of issues with feature importance there, but not any worse than with logistic regression.