by nonbel
3080 days ago
If you cross-validate on a dataset, then change the features (or hyperparameters) and cross-validate again, picking the best model, you will overfit to the CV. This is a form of data leakage, and it will make you overly optimistic about your model's performance on unseen data. This is well known, and honestly it only takes one time working with a real holdout set (no cheating) to learn it for life. E.g.:
https://datascience.stackexchange.com/questions/17288/why-k-...
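
To make the effect concrete, here is a rough sketch in plain Python (standard library only). Everything in it is my own illustration, not something from the linked question: the data is pure noise, the "model" is a toy one-feature sign rule, and names like `run_demo` and `cv_accuracy` are made up for this example. Since features and labels are independent, the true accuracy of any model is 50%, so whatever the best CV score shows above that is selection bias.

```python
import random


def make_noise_data(n_rows, n_features, rng):
    """Features and labels are drawn independently, so true accuracy is 50%."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_rows)]
    y = [rng.randint(0, 1) for _ in range(n_rows)]
    return X, y


def fit_direction(X, y, j, rows):
    """Toy model: predict 1 when feature j > 0, or the reverse,
    whichever agrees with more of the training rows."""
    agree = sum((X[i][j] > 0) == (y[i] == 1) for i in rows)
    return 1 if 2 * agree >= len(rows) else -1


def accuracy(X, y, j, direction, rows):
    """Fraction of the given rows that the fitted sign rule gets right."""
    hits = 0
    for i in rows:
        pred = 1 if direction * X[i][j] > 0 else 0
        hits += pred == y[i]
    return hits / len(rows)


def cv_accuracy(X, y, j, k=5):
    """Plain k-fold CV for the toy model on feature j."""
    n = len(y)
    folds = [list(range(f, n, k)) for f in range(k)]
    scores = []
    for test_rows in folds:
        held_out = set(test_rows)
        train_rows = [i for i in range(n) if i not in held_out]
        direction = fit_direction(X, y, j, train_rows)
        scores.append(accuracy(X, y, j, direction, test_rows))
    return sum(scores) / k


def run_demo(seed=0, n_dev=200, n_holdout=2000, n_features=200):
    """Pick the feature with the best CV score, then score it on untouched data."""
    rng = random.Random(seed)
    X, y = make_noise_data(n_dev + n_holdout, n_features, rng)
    X_dev, y_dev = X[:n_dev], y[:n_dev]
    X_hold, y_hold = X[n_dev:], y[n_dev:]

    # "Change the features and CV again" many times, keeping the best score.
    cv_scores = [cv_accuracy(X_dev, y_dev, j) for j in range(n_features)]
    best_j = max(range(n_features), key=lambda j: cv_scores[j])

    # Refit the winner on all development data, then evaluate once on the holdout.
    direction = fit_direction(X_dev, y_dev, best_j, range(n_dev))
    holdout = accuracy(X_hold, y_hold, best_j, direction, range(n_holdout))
    return cv_scores[best_j], holdout


best_cv, holdout = run_demo()
print(f"best CV accuracy after selection: {best_cv:.3f}")
print(f"accuracy on untouched holdout:    {holdout:.3f}")
```

The best CV score should come out well above chance while the holdout score sits near 50%, even though no single CV run did anything wrong. The leak is in the selection step, which is why the final number has to come from data that played no part in choosing the model.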