| HN Mirror


by srean 2903 days ago

As much as I respect Breiman, I think he latched on too hard to his pet theory that all ensembles do is reduce variance, and in doing so missed what boosting does.

Yeah, random forests work really well, but they are layers and layers of hacks, rules of thumb, and intuition piled on top of one another. I can't claim with a straight face that any of them follows from solid principles.

Graycat and I have a history of discussing the differences between stats and ML here on HN. I just added a comment upstream in the thread.

1 comments

davidsrosenberg 2903 days ago

I imagine Breiman was just talking about bagging-style parallel ensembles, when he was talking about variance reduction, not boosting-style sequential ensembles. Not long before he died, he was still actively trying to figure out why AdaBoost “works”. Don’t think he claimed to really understand that. He had experimental results that disputed the “it’s just maximizing the margin” explanation.

Saw the comments above — are you from a stats or ML background, or neither?


srean 2902 days ago

I am more ML than stats. BTW, Breiman believed the same about boosting: that it, too, works by reducing variance. Later he got a little unsure. You will find this in his writings on boosting.


davidsrosenberg 2902 days ago

Interesting -- if you've got a link, please post it.


srean 2901 days ago

You can take a look at his technical reports from 2001 onwards. There is a lot of very interesting material in them, including a lot of back-and-forth between Freund, Schapire, and Breiman. The reports continue to be hosted on Breiman's home page.
