Hacker News
by ramraj07 2575 days ago
But we are merging hundreds of trees, each of which has been handicapped by seeing only a subset of the features and a fraction of the data. It sounds to me like overfitting is not easy: no single data point or feature contributes to every tree, so it can't be represented all the time.
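To make the sampling argument above concrete, here is a minimal sketch in plain Python. It assumes the usual random-forest recipe of a bootstrap sample of rows (drawn with replacement) for each tree, and it simplifies feature selection to one random subset per tree, whereas common implementations draw a fresh subset at every split. The names `sample_forest_inputs` and `share_of_trees` and all the sizes are made up for illustration; no trees are actually fitted.

```python
import random


def sample_forest_inputs(n_rows, n_features, n_trees, max_features, seed=0):
    """Return, for each tree, the set of rows and the set of features it sees.

    Assumption: rows are a bootstrap sample (with replacement), and features
    are drawn once per tree rather than once per split.
    """
    rng = random.Random(seed)
    plans = []
    for _ in range(n_trees):
        # Bootstrap: draw n_rows row indices with replacement, keep the distinct ones.
        rows = {rng.randrange(n_rows) for _ in range(n_rows)}
        # Feature subset: draw max_features distinct feature indices.
        features = set(rng.sample(range(n_features), max_features))
        plans.append((rows, features))
    return plans


def share_of_trees(plans, index, kind):
    """Fraction of trees whose row set (kind='row') or feature set contains index."""
    position = 0 if kind == "row" else 1
    return sum(index in plan[position] for plan in plans) / len(plans)


plans = sample_forest_inputs(n_rows=100, n_features=9, n_trees=200, max_features=3)
row_share = share_of_trees(plans, 0, "row")
feature_share = share_of_trees(plans, 0, "feature")
print(f"row 0 appears in {row_share:.0%} of trees")
print(f"feature 0 appears in {feature_share:.0%} of trees")
```

Under these assumptions a given row is expected to land in roughly 63% of the trees and a given feature in about a third of them, so nothing is present in every tree. That only shows that no single point dominates; it does not by itself settle whether the averaged forest can still fit accidental patterns.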

False as these claims may be, I've seen them in at least two of the most commonly studied statistical learning textbooks, so given that the argument makes sense and that it's in the textbooks, it seems reasonably not false to me. Someone else posted that if too many features or data points are very similar then it will overfit, and that totally makes sense. Whatever you say doesn't. Clarification would be useful.

1 comment

Adding bunches of trees will overfit the accidental patterns in your data.

I have an explanation here of why reducing variance is not the same as reducing overfitting: https://news.ycombinator.com/item?id=20089890