| HN Mirror

	by stu2b50 1070 days ago
	I don't think it's a scandal; it's a natural thing that happens when iterating on models. OP doesn't mean they literally train on those tests, but that as a meta-consequence of using those tests as benchmarks, you will adjust the model and hyperparameters in ways that perform better on those tests. For a particular model you try to minimize this by keeping separate validation and test sets, but on a meta-meta level, it's easy to see it happening.
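
	A toy simulation can make the meta-level effect concrete. This is only a sketch under made-up assumptions: every "model variant" is a pure coin-flipper with a true accuracy of 50%, and the names `simulate`, `n_models`, and `n_examples` are hypothetical, not from any real benchmark. Nothing is ever trained on the benchmark; the only thing that happens is keeping whichever variant scores best on it.

```python
import random


def simulate(n_models=200, n_examples=500, seed=0):
    """Pick the best of many chance-level variants on a fixed benchmark.

    Assumption (illustrative only): each variant answers every example
    correctly with probability 0.5, so no variant is truly better than another.
    Returns (benchmark accuracy, fresh-data accuracy) of the selected variant.
    """
    rng = random.Random(seed)

    def accuracy():
        # Fraction of n_examples answered correctly by a coin-flipping variant.
        return sum(rng.random() < 0.5 for _ in range(n_examples)) / n_examples

    bench = [accuracy() for _ in range(n_models)]  # scores on the public benchmark
    fresh = [accuracy() for _ in range(n_models)]  # scores on unseen data

    # "Iterating on the model": keep whichever variant looks best on the benchmark.
    best = max(range(n_models), key=bench.__getitem__)
    return bench[best], fresh[best]


if __name__ == "__main__":
    bench_acc, fresh_acc = simulate()
    print(f"benchmark accuracy of selected variant: {bench_acc:.3f}")
    print(f"same variant on fresh data:             {fresh_acc:.3f}")
```

	The selected variant's benchmark score lands above 50% purely because of selection, while its score on fresh data stays near chance. Real model iteration is far less extreme than picking among coin-flippers, but the mechanism is the same one described above.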