| HN Mirror


by kimukasetsu 1826 days ago

I strongly agree with this. Not to mention parameter interpretability and, in the case of Bayesian models, uncertainty estimates and convergence diagnostics. Such things are very important when making decisions under uncertainty. Kaggle competitions and empirical benchmarks are very biased samples of model performance in real life.

I feel these two things often have too much influence on the course of Machine Learning research and communities, and this is not good. Most ML researchers and practitioners are barely aware of the latest advances in parametric modelling, which is a shame. Multilevel models allow you to model response variables with explicit dependence structures. This is done through random (sometimes hierarchical) effects constrained by variance parameters. These parameters regularize the effects themselves and converge really well when fitting factors with high cardinality.

Also, multilevel models are very interesting when it comes to the bias-variance tradeoff. Having more levels in a distribution of random effects actually DECREASES [1] overfitting, which is fascinating.

[1] https://m-clark.github.io/posts/2019-05-14-shrinkage-in-mixe...
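To make the shrinkage point concrete, here is a minimal sketch of partial pooling in a random-intercept setting, using NumPy only. Everything in it is an assumption for illustration and not taken from the comment or the linked post: the simulated data, the group counts, the variance values, and the simple method-of-moments variance estimates (a real multilevel fit would estimate these by maximum likelihood or MCMC).

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical simulation settings (illustrative, not from the thread)
n_groups, n_per_group = 200, 3
tau, sigma = 1.0, 2.0  # between-group and within-group standard deviations

# Each group has a true effect drawn from a common distribution
true_effects = rng.normal(0.0, tau, size=n_groups)
y = true_effects[:, None] + rng.normal(0.0, sigma, size=(n_groups, n_per_group))

# No pooling: estimate each group effect by its own sample mean
raw_means = y.mean(axis=1)

# Partial pooling: shrink each group mean toward the grand mean.
# Variance components are estimated by method of moments (an assumption here).
grand_mean = raw_means.mean()
sigma2_hat = y.var(axis=1, ddof=1).mean()
tau2_hat = max(raw_means.var(ddof=1) - sigma2_hat / n_per_group, 0.0)
shrink = tau2_hat / (tau2_hat + sigma2_hat / n_per_group)
pooled_means = grand_mean + shrink * (raw_means - grand_mean)

# Compare how well each estimator recovers the true group effects
mse_raw = np.mean((raw_means - true_effects) ** 2)
mse_pooled = np.mean((pooled_means - true_effects) ** 2)
print(f"shrinkage factor: {shrink:.2f}")
print(f"MSE, no pooling:      {mse_raw:.3f}")
print(f"MSE, partial pooling: {mse_pooled:.3f}")
```

The shrinkage factor is the estimated between-group variance divided by the total variance of a group mean, so noisy groups with few observations get pulled hardest toward the grand mean. That is the regularizing role of the variance parameter described above.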

2 comments

borroka 1826 days ago

While I agree and it is surprising that multi-level/hierarchical modeling is rarely applied in industry (I used them extensively in academia and industry), dealing with hundreds or thousands of random effects in large data sets, especially in non-linear models, is a computational nightmare. And the benefits may not warrant those nightmares.

RA_Fisher 1826 days ago

Finally, multi-level/hierarchical modeling is starting to permeate industry thanks to Stan and company.

I use hierarchical modeling regularly to help build Zapier. So do other companies like Generable: https://www.generable.com/

I suspect hierarchical models will become the next “new” hot data structure in software engineering due to their ability to compact logic. https://twitter.com/statwonk/status/1363104221747421184?s=21

borroka 1824 days ago

I don't know about permeating the industry. I know for example that the model that Airbnb used 3 years ago (things may have changed in the meantime) to forecast occupancy was a random-effects model maintained by a single person in Europe. I don't know about the penetrance of Generable and companies providing similar probabilistic modeling solutions, although I hope they succeed.

When I was working for one of the FAANGs, I was the only one using random effects models (that I know of), in particular non-linear random effects models with hundreds of random effects. I was using a language/tool faster than Stan (fitting the same model with Stan would have taken hours, or more likely days), but making the models converge was always challenging. In addition, most of my colleagues had a CS background, were in love with the latest uninterpretable, brute-force algorithm, and were scared of a more statistical approach that they made no effort to learn, so I faced pushback and skepticism despite the model working very well.

I love random effects models, and I built my technical career on them.

laichzeit0 1826 days ago

I think one of the main reasons is that there is no good Python library for linear mixed-effects models. There is no sklearn implementation. There are some libraries that wrap R's lmer (probably using rpy2 or something). The best native Python library I could find is statsmodels, and it has several shortfalls: saving a model to disk consumes hundreds of megabytes; the predict method is useless, since it predicts using only the fixed effects; multi-level modeling beyond a single group is not clearly documented, and the syntax sucks if you do attempt it, never mind actually implementing a predict method that uses those random effects. I think once someone does a decent sklearn implementation, it might take off. I've been thinking of doing an implementation for sklearn as a side project, but I'm not an ML researcher, just a practitioner, so it might suck :)

fho 1826 days ago

I used statsmodels for a while ... it's definitely possible to predict for arbitrary inputs, it's just a pain to fiddle in the right inputs ...
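For what it's worth, a rough sketch of the kind of fiddling involved, for the simplest case of a single grouping factor with a random intercept. The simulated data, the column names (`y`, `x`, `g`), and the helper `predict_with_random_intercept` are all hypothetical; only `mixedlm`, `fit`, `predict`, and `random_effects` are statsmodels calls. Groups not seen during fitting are assumed to fall back to the fixed-effects prediction.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data (hypothetical): 20 groups, each with its own intercept shift
rng = np.random.default_rng(0)
n_groups, n_per_group = 20, 30
g = np.repeat(np.arange(n_groups), n_per_group)
u = rng.normal(0.0, 2.0, size=n_groups)
x = rng.normal(size=g.size)
y = 1.0 + 0.5 * x + u[g] + rng.normal(0.0, 1.0, size=g.size)
df = pd.DataFrame({"y": y, "x": x, "g": g})

# Random-intercept model: fixed slope for x, one intercept offset per group
result = smf.mixedlm("y ~ x", df, groups=df["g"]).fit()


def predict_with_random_intercept(result, new_df, group_col="g"):
    """Hypothetical helper: fixed-effects prediction plus the group's intercept."""
    fixed = result.predict(new_df)   # uses the fixed effects only
    re = result.random_effects       # dict: group label -> Series of effects
    # Unseen groups get an offset of 0, i.e. the population-level prediction
    offsets = new_df[group_col].map(
        lambda k: float(re[k].iloc[0]) if k in re else 0.0
    )
    return fixed + offsets.to_numpy()


pred_fixed = result.predict(df)
pred_full = predict_with_random_intercept(result, df)
rmse_fixed = np.sqrt(np.mean((df["y"] - pred_fixed) ** 2))
rmse_full = np.sqrt(np.mean((df["y"] - pred_full) ** 2))
print(f"in-sample RMSE, fixed effects only:   {rmse_fixed:.3f}")
print(f"in-sample RMSE, with random intercept: {rmse_full:.3f}")

# A group the model has never seen falls back to the fixed-effects prediction
new = pd.DataFrame({"x": [0.0], "g": [999]})
pred_new = predict_with_random_intercept(result, new)
```

Random slopes or nested groups would need the matching random-effects design columns multiplied in as well, which is where it gets painful.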
