| > GBDTs are still unbeatable. You'd be surprised how many times I've replaced a GBDT with logistic regression and had a negligible drop-off in model performance with a dramatic improvement in both training time and the ease of debugging and fixing production models. I've had plenty of cases where a bit of reasonable feature transformation can get a logistic model to outperform a GBDT. Any non-linearity you're picking up with a GBDT can often easily be captured with some very simple feature tweaking. My experience has been that GBDTs are only particularly useful in Kaggle contests, where minuscule improvements in an arbitrary metric are valuable and training time and model debugging are completely unimportant. There are absolutely cases where NNs can go places that logistic regression can't touch (CV and NLP), but I have yet to see a real-world production pipeline where GBDT provides enough improvement over logistic regression to throw out all of the performance and engineering benefits of linear models. |
I feel these two things often influence the course of Machine Learning research and communities too much, and this is not good. Most ML researchers and practitioners are barely aware of the latest advances in parametric modelling, which is a shame. Multilevel models allow you to model response variables with explicit dependence structures. This is done through random (sometimes hierarchical) effects constrained by variance parameters. These parameters regularize the effects themselves, and the models converge really well when fitting factors with high cardinality.
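Also, multilevel models are very interesting when it comes to the bias-variance tradeoff. Having more levels in a distribution of random effects actually DECREASES [1] overfitting, which is fascinating: every extra level gives the model more information for estimating the variance parameter that shrinks the effects.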
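To make the shrinkage point concrete, here is a rough sketch in plain Python (standard library only). It is not how a real mixed-model package fits things; it assumes a balanced random-intercept model y_ij = mu + a_j + e_ij with a_j ~ N(0, tau^2), and it uses simple moment estimates for the variances rather than REML or maximum likelihood. The function names (`simulate`, `shrink`, `mse`) and all the simulation settings are made up for illustration.

```python
import random
import statistics


def simulate(n_groups=200, n_per_group=3, tau=1.0, sigma=2.0, seed=0):
    """Draw group effects from N(0, tau^2), then noisy observations per group."""
    rng = random.Random(seed)
    effects = [rng.gauss(0.0, tau) for _ in range(n_groups)]
    data = [[rng.gauss(a, sigma) for _ in range(n_per_group)] for a in effects]
    return effects, data


def shrink(data):
    """Partial pooling for a balanced design, using moment estimates (assumption)."""
    n = len(data[0])  # every group has the same number of observations
    group_means = [statistics.fmean(g) for g in data]
    grand_mean = statistics.fmean(group_means)
    # Within-group (residual) variance, averaged over groups.
    sigma2 = statistics.fmean([statistics.variance(g) for g in data])
    # Between-group variance: spread of the group means minus sampling noise.
    tau2 = max(0.0, statistics.variance(group_means) - sigma2 / n)
    # Weight near 1 trusts each group's own mean; near 0 pulls to the grand mean.
    weight = tau2 / (tau2 + sigma2 / n)
    pooled = [grand_mean + weight * (m - grand_mean) for m in group_means]
    return group_means, pooled, weight


def mse(estimates, truth):
    """Mean squared error of the estimates against the simulated true effects."""
    return statistics.fmean([(e - t) ** 2 for e, t in zip(estimates, truth)])


effects, data = simulate()
group_means, pooled, weight = shrink(data)
mse_none = mse(group_means, effects)
mse_partial = mse(pooled, effects)

print(f"shrinkage weight:          {weight:.2f}")
print(f"MSE, raw group means:      {mse_none:.3f}")
print(f"MSE, shrunken group means: {mse_partial:.3f}")
```

With many small groups the shrunken estimates should land closer to the true effects than the raw per-group means, which is the regularization the variance parameter buys you. Rerunning with a larger `n_groups` lets you see how the between-group variance estimate stabilizes as levels are added.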
[1] https://m-clark.github.io/posts/2019-05-14-shrinkage-in-mixe...