| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by zamfi 1230 days ago

What's missing is that they did not use the test set during the training process.

A common split is train/validate/test, but all three are used during training -- train to actually train, validate for intermediate loss, test for model comparison.

What you want is a fourth, held-out test set that isn't looked at until publish time.

This paper has two test sets, but they have different data properties, and it's not clear they were held out until publication.

1 comments

whatshisface 1230 days ago

I can't believe you can publish with such an obvious flaw in study design. You don't have to have machine learning expertise to catch that, because the same idea applies to backtesting any kind of model at all.

link

YeGoblynQueenne 1230 days ago

And yet they do. I don't remember ever reading any machine learning paper where the authors point out that they held out a test partition that they couldn't access until some "final" step of the experiment.

link