by rm999 5052 days ago
It's not about small tweaks; it can be substantial additions to a model that improve its actual, out-of-sample performance. A popular method in these contests is ensembling, which involves building many sub-models and combining their scores into a single ensemble model. The Netflix winner used ~100 sub-models in their ensemble, but the vast majority of the predictive power came from just three of those sub-models (can't find the source now).
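For anyone who hasn't seen ensembling written down, here is a minimal sketch of the simplest version: a weighted average of sub-model scores. The ratings, the three sub-models and the equal weights are all made up for illustration; this is not the Netflix winner's blending method, which was far more elaborate.

```python
from math import sqrt


def rmse(predictions, truth):
    """Root-mean-square error between two equal-length lists."""
    return sqrt(sum((p - t) ** 2 for p, t in zip(predictions, truth)) / len(truth))


def blend(sub_model_predictions, weights):
    """Weighted average of the sub-models' scores, one value per example."""
    total = sum(weights)
    n_examples = len(sub_model_predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, sub_model_predictions)) / total
        for i in range(n_examples)
    ]


# Hypothetical true ratings and hypothetical sub-model outputs.
truth = [3.0, 4.0, 2.0, 5.0, 1.0]
model_a = [3.6, 3.5, 2.4, 4.6, 1.5]
model_b = [2.5, 4.6, 1.5, 5.5, 0.6]
model_c = [3.3, 4.2, 2.3, 4.8, 1.2]

sub_models = [model_a, model_b, model_c]
blended = blend(sub_models, weights=[1.0, 1.0, 1.0])  # equal weights: an assumption

for name, preds in [("a", model_a), ("b", model_b), ("c", model_c), ("blend", blended)]:
    print(f"{name}: RMSE = {rmse(preds, truth):.3f}")
```

The blend helps here only because the invented errors partly cancel. With real sub-models the weights would normally be fit on held-out data rather than set by hand.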

Ah, I think I see what you are saying: essentially, that building and tuning the blending method and model selection for a 100+ model ensemble gives you only a slightly better prediction than an appropriately chosen, reasonably performant single model, at a large cost in both computation and human labor?

What I was addressing was that some users on Kaggle seemed frustrated that people were essentially submitting models with small parameter tweaks in order to marginally boost leaderboard scores. To these complaints I would argue that over-fitting is its own punishment.
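A toy simulation of what I mean, under some loud assumptions: every "tweak" is a coin-flip predictor with no real skill, the leaderboard is scored on a small public set, and the final ranking uses a larger held-out private set. The sizes and names are invented, and this is not a model of any actual competition.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical sizes for the public and private scoring sets.
N_PUBLIC, N_PRIVATE, N_SUBMISSIONS = 100, 1000, 200

public_labels = [rng.randint(0, 1) for _ in range(N_PUBLIC)]
private_labels = [rng.randint(0, 1) for _ in range(N_PRIVATE)]


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


best_public = 0.0
private_of_best = 0.0
for _ in range(N_SUBMISSIONS):
    # Each submission guesses at random on both sets.
    public_preds = [rng.randint(0, 1) for _ in range(N_PUBLIC)]
    private_preds = [rng.randint(0, 1) for _ in range(N_PRIVATE)]
    public_score = accuracy(public_preds, public_labels)
    # Keep whichever submission looks best on the public leaderboard.
    if public_score > best_public:
        best_public = public_score
        private_of_best = accuracy(private_preds, private_labels)

print(f"best public accuracy:       {best_public:.3f}")
print(f"same submission, private:   {private_of_best:.3f}")
```

The submission that tops the public set does so by luck, so its private score should fall back to roughly chance. That drop is the punishment.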

Thanks for the clarification!