by YeGoblynQueenne
1436 days ago
As an addendum, note that training for a competition does not eliminate this
overfitting to the test set. Most competitions make the test set instances
available, though not their labels. Many restrict the number of submissions that
can be made but usually accept several. There's even a bit of jargon for
gaming this: "hill climbing the test set" (really, it's hill climbing the
performance on the test set, i.e. it's the accuracy on the test set that's
optimised). Here's an actual how-to:
https://blockgeni.com/how-to-hill-climb-the-test-set-for-mac...

One benchmark I know where the test set is completely hidden is François
Chollet's ARC dataset, and that's done precisely to preclude overfitting to the
test set.
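
To make "hill climbing the test set" concrete, here's a toy sketch in Python. It is not the method from the linked how-to, just a minimal illustration under assumptions of my own: binary labels, a leaderboard that returns only accuracy, and a generous submission budget. All names (`make_oracle`, `hill_climb_test_set`, and so on) are hypothetical. Note that no model is trained at all; the "learner" only ever sees the score.

```python
import random


def make_oracle(hidden_labels):
    """Hypothetical stand-in for a leaderboard: reports accuracy, never labels."""
    def score(predictions):
        correct = sum(p == y for p, y in zip(predictions, hidden_labels))
        return correct / len(hidden_labels)
    return score


def hill_climb_test_set(score, n_instances, n_submissions, seed=0):
    """Flip one prediction per submission; keep the flip only if accuracy rises."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in range(n_instances)]  # random first guess
    best_score = score(best)
    for _ in range(n_submissions):
        candidate = list(best)
        i = rng.randrange(n_instances)
        candidate[i] = 1 - candidate[i]  # flip a single binary prediction
        candidate_score = score(candidate)  # one "submission" to the leaderboard
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Assumed toy setup: 50 test instances with labels the "competitor" never sees.
label_rng = random.Random(42)
hidden = [label_rng.randint(0, 1) for _ in range(50)]
oracle = make_oracle(hidden)

baseline_rng = random.Random(0)
baseline_score = oracle([baseline_rng.randint(0, 1) for _ in range(50)])
final_predictions, final_score = hill_climb_test_set(oracle, 50, 2000, seed=0)

print(f"random guess: {baseline_score:.2f}, after hill climbing: {final_score:.2f}")
```

The score climbs purely from leaderboard feedback, which is why capping submissions slows this down but doesn't stop it, and why hiding the test instances entirely (as ARC does) takes away the thing being perturbed.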