by stiff
4124 days ago
If you use performance on the test set for model selection, this is not true. It follows from simple probabilistic reasoning: the more models you try, the higher the chance that one will score well on both the training set and the test set by "luck", and this is especially true with small datasets. In fact, it is best practice to use a separate validation set for model selection and to use the test set only for final performance evaluation; see e.g. the answer to this question: http://stats.stackexchange.com/questions/9357/why-only-three...
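To make the "luck" argument concrete, here is a rough simulation sketch. Everything in it is an assumption chosen for illustration: the labels are coin flips, each "model" is pure random guessing (so its true accuracy is 50%), and the names `simulate_selection`, `n_models`, `n_samples` and `n_trials` are hypothetical. It picks the best of many such models by its score on one small set, then checks the same model on data it was never selected on.

```python
import random


def simulate_selection(n_models=200, n_samples=30, n_trials=50, seed=0):
    """Average accuracy of the 'best' random model on the set used to pick it,
    and on a fresh set that played no role in the selection."""
    rng = random.Random(seed)

    def accuracy(preds, labels):
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)

    def coin_flips(n):
        return [rng.randint(0, 1) for _ in range(n)]

    selection_scores, fresh_scores = [], []
    for _ in range(n_trials):
        # Labels are coin flips, so no model can truly beat 50% accuracy.
        selection_labels = coin_flips(n_samples)
        fresh_labels = coin_flips(n_samples)

        best_selection_acc, best_fresh_acc = -1.0, 0.0
        for _ in range(n_models):
            # Each "model" just guesses at random on both sets.
            selection_acc = accuracy(coin_flips(n_samples), selection_labels)
            fresh_acc = accuracy(coin_flips(n_samples), fresh_labels)
            # Model selection: keep whichever model scores best on the selection set.
            if selection_acc > best_selection_acc:
                best_selection_acc, best_fresh_acc = selection_acc, fresh_acc

        selection_scores.append(best_selection_acc)
        fresh_scores.append(best_fresh_acc)

    return sum(selection_scores) / n_trials, sum(fresh_scores) / n_trials


if __name__ == "__main__":
    selected, fresh = simulate_selection()
    print(f"mean accuracy on the set used for selection: {selected:.2f}")
    print(f"mean accuracy of the same models on fresh data: {fresh:.2f}")
```

Under these assumptions the score on the set used for selection should come out well above 50%, while the same models stay near 50% on fresh data. Raising `n_samples` should narrow the gap, which matches the point about small datasets.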