by mbq 4029 days ago
It is a common misconception and a huge source of disappointment with ML: without proper validation of the whole model-building procedure (method selection + parameter tuning + feature selection + fitting), no amount of data or magic tricks will assure you that there is no overfitting. Even a single hold-out test is risky, because it gives you no idea of the variance of the estimated accuracy.
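One way to validate the whole procedure rather than just the final fit is nested cross-validation: tuning is repeated inside each outer fold using only that fold's training data, and the spread of the outer scores gives a rough idea of the variance. The sketch below is only an illustration under assumptions of mine: the synthetic 1-D data, the toy k-NN classifier, the candidate values of k, and the helper names (`k_folds`, `build_model`, `nested_cv`) are all hypothetical.

```python
import random
import statistics

def k_folds(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def split(data, fold):
    """Return (train, test) where test holds the rows indexed by `fold`."""
    held = set(fold)
    train = [row for i, row in enumerate(data) if i not in held]
    test = [data[i] for i in fold]
    return train, test

def knn_predict(train, x, k):
    """Toy 1-D k-NN: majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)

def build_model(train, candidates, rng, n_inner=3):
    """The whole procedure: pick k by inner CV on the training data only."""
    def inner_score(k):
        folds = k_folds(len(train), n_inner, rng)
        return statistics.mean(
            accuracy(*split(train, f), k) for f in folds
        )
    return max(candidates, key=inner_score)

def nested_cv(data, candidates=(1, 3, 5, 7), n_outer=5, seed=0):
    """Outer loop scores the procedure on data it never saw while tuning."""
    rng = random.Random(seed)
    scores = []
    for fold in k_folds(len(data), n_outer, rng):
        train, test = split(data, fold)
        best_k = build_model(train, candidates, rng)  # tuning sees train only
        scores.append(accuracy(train, test, best_k))
    return scores

# Hypothetical synthetic data: class 0 centred at 0, class 1 centred at 3.
data_rng = random.Random(1)
data = [(data_rng.gauss(3 * (i % 2), 1.0), i % 2) for i in range(100)]

scores = nested_cv(data)
print("outer-fold accuracies:", scores)
print("mean:", statistics.mean(scores), "stdev:", statistics.stdev(scores))
```

The point of the design is that `build_model` never touches the outer test fold, so the outer scores estimate the accuracy of the procedure, not of one lucky parameter choice.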

Well, you can use the bootstrap to estimate the variance. It costs computation, but it works. Cosma Shalizi wrote a really nice introduction to it: http://www.americanscientist.org/issues/pub/2010/3/the-boots...
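A minimal sketch of the idea, assuming you already have per-example results from a hold-out set: resample those results with replacement many times and look at the spread of the resampled accuracies. The data and the function name `bootstrap_accuracy` are made up for illustration. Note that this only captures the variance due to the particular test sample, not the variance from refitting the model on different training data.

```python
import random
import statistics

def bootstrap_accuracy(correct, n_resamples=2000, seed=0):
    """Return (mean, stdev) of accuracy over bootstrap resamples.

    `correct` is a list of 0/1 flags, one per hold-out example
    (1 = the model's prediction was right).
    """
    rng = random.Random(seed)
    n = len(correct)
    accs = []
    for _ in range(n_resamples):
        # Draw n examples with replacement from the hold-out results.
        sample = rng.choices(correct, k=n)
        accs.append(sum(sample) / n)
    return statistics.mean(accs), statistics.stdev(accs)

# Hypothetical hold-out results: 80 correct and 20 wrong predictions.
correct = [1] * 80 + [0] * 20
mean_acc, std_acc = bootstrap_accuracy(correct)
print(f"accuracy ~ {mean_acc:.3f} +/- {std_acc:.3f}")
```

The cost is `n_resamples` passes over the hold-out results, which is cheap here; it gets expensive when each resample requires refitting the model.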