by jre 3843 days ago
They have a separate test set. But by submitting a lot of entries with slightly different parameters, you can optimize your parameters based on the test score.

Some (all?) Kaggle competitions also have a daily submission limit to prevent this kind of cheating.


I think you misunderstand. Kaggle has a training set (known to all participants), a "public leaderboard" test set (secret) and a "private leaderboard" test set (also secret). You can get your model's score on the "public" test set a couple of times per day. Your score on the private test set is only revealed once, after the competition has ended.

People can, and do, overfit their model to the public test set, but doing so does not improve their score on the private test set, so cheating is prevented even without the submission limit.
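To make that concrete, here is a rough simulation, not anything Kaggle actually runs. Every name and number in it (`simulate`, 500 submissions, 1000 examples per split) is made up for illustration, and the "models" are pure coin flips, so their true accuracy is 50%. Picking the submission with the best public score inflates the public number, while the private score of that same submission is just another draw around 50%.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def simulate(seed=0, n_submissions=500, n_public=1000, n_private=1000):
    """Pick the best of many random submissions by public score.

    Hypothetical setup: binary labels, and every submission guesses at
    random, so no submission has any real skill.
    Returns (public accuracy, private accuracy) of the selected submission.
    """
    rng = random.Random(seed)
    public_labels = [rng.randint(0, 1) for _ in range(n_public)]
    private_labels = [rng.randint(0, 1) for _ in range(n_private)]

    best_public = -1.0
    best_private = None
    for _ in range(n_submissions):
        public_preds = [rng.randint(0, 1) for _ in range(n_public)]
        private_preds = [rng.randint(0, 1) for _ in range(n_private)]
        public_score = accuracy(public_preds, public_labels)
        # Selection only ever looks at the public score.
        if public_score > best_public:
            best_public = public_score
            best_private = accuracy(private_preds, private_labels)
    return best_public, best_private


if __name__ == "__main__":
    public_score, private_score = simulate()
    print(f"selected submission: public={public_score:.3f} private={private_score:.3f}")
```

Real submissions are correlated models rather than coin flips, so the gap is usually less extreme, but the selection effect is the same.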

The submission limit helps ensure that the leaderboard generated from the public test set stays close to the one generated from the private test set while the competition is running, so that you can get an idea of your standing. But participants know better than to take the public leaderboard too seriously.

I stand corrected :) It seems like ILSVRC only has two datasets, though. I wonder why they don't use a Kaggle-like approach? My guess is that they want people to be able to know their test score so they can put it in their papers.