Hacker News new | ask | show | jobs
by mbq 4733 days ago
You are just doing a simple validation on a test set rather than cross-validation; the point of CV is to make many iterations of validation on different train-test splits and average the results.
1 comments

I agree completely a more complex benchmark should be done with a complete cross-validation.

Just for future reference I did ran the fitting a few times founding very(+-2%) similar results. Also Random Forests do an average so probably not much to improve on that particular algorithm.

To be honest I don't expect the results to change; but this is an only way to attach significance to the observed differences and to ensure this wasn't a lucky shot.