Hacker News
by folli 2977 days ago
In random forests you don't really need to find the optimal split. You usually generate a number of random splits and select the best one.

I believe the formulation of random forests requires you to find the optimal split, albeit over a subset of features.

What you're talking about, where you simply generate a set of random splits across features, is Extremely Randomized Trees (https://link.springer.com/article/10.1007%2Fs10994-006-6226-...).

Since we're splitting hairs instead of training sets: Classical RFs select one random feature and choose the optimal split, whereas Extremely RTs choose the best feature (out of a random subset) whereby for each feature only one random split is tested.

Another difference is that RFs use a bootstrapped dataset and ERTs use the full dataset.
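
For what it's worth, here is a rough sketch of the two split rules for a single node, as I understand them from this thread. It is not anyone's reference implementation. The names (`choose_split`, `training_rows`, the parameter `k`) are made up for illustration, and Gini impurity is my assumption for the split criterion. With `k=1` the non-randomized rule matches the "one random feature" description above; with `k>1` it searches a feature subset.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def split_impurity(xs, ys, threshold):
    """Size-weighted Gini impurity after splitting on x <= threshold."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    n = len(ys)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def optimal_threshold(xs, ys):
    """RF-style: exhaustive search over midpoints between sorted unique values."""
    values = sorted(set(xs))
    # A constant feature has no midpoints; fall back to its single value.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    return min(candidates, key=lambda t: split_impurity(xs, ys, t))

def random_threshold(xs, rng):
    """Extra-Trees-style: one uniformly random cut point per feature."""
    return rng.uniform(min(xs), max(xs))

def choose_split(X, y, k, rng, randomized):
    """Pick k random features, get one threshold per feature, keep the best.

    Returns (impurity, feature_index, threshold).
    """
    best = None
    for j in rng.sample(range(len(X[0])), k):
        xs = [row[j] for row in X]
        t = random_threshold(xs, rng) if randomized else optimal_threshold(xs, y)
        score = split_impurity(xs, y, t)
        if best is None or score < best[0]:
            best = (score, j, t)
    return best

def training_rows(n, rng, bootstrap):
    """Row indices for one tree: a bootstrap resample, or every row once."""
    if bootstrap:
        return [rng.randrange(n) for _ in range(n)]
    return list(range(n))

X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [0, 0, 1, 1]
rng = random.Random(0)
print(choose_split(X, y, k=2, rng=rng, randomized=False))  # (0.0, 0, 2.5)
print(choose_split(X, y, k=2, rng=rng, randomized=True))   # depends on the seed
```

When both rules look at the same features, the exhaustive search can never do worse on that node than the random cut. The random version just skips the search over thresholds, which is where the speedup comes from.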

Ah, I was under the impression that RFs choose from a subset of features, not just one feature.

In any case, I agree with the thrust of your original comment that the specifications of the RF algorithm can be relaxed, usually for computational performance, and still retain strong predictive performance. But this goes back to my original comment that the performance considerations of random forests often aren't highlighted to new learners (whereas introducing ERTs to a beginner would probably shock them: how could you take totally random splits and still get any reasonable performance?).

> Ah, I was under the impression that RFs choose from a subset of features, not just one feature.

You are correct. For classification, the usual rule of thumb is to consider a random subset of features at each split, with the subset size equal to the square root of the total number of features.
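
A tiny sketch of that rule of thumb, in case the arithmetic helps. The function names are hypothetical, and rounding down (with a floor of one feature) is my assumption; implementations may round differently.

```python
import math
import random

def n_split_features(n_features):
    """Rule-of-thumb subset size for classification: floor(sqrt(p)), at least 1."""
    return max(1, math.isqrt(n_features))

def sample_split_features(n_features, rng):
    """Feature indices one node would consider, drawn without replacement."""
    return sorted(rng.sample(range(n_features), n_split_features(n_features)))

for p in (4, 10, 100):
    print(p, n_split_features(p))
# 4 2
# 10 3
# 100 10
```

In scikit-learn this corresponds to passing `max_features="sqrt"` to the forest classifiers, so you rarely compute it by hand.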