Hacker News
by w1 2346 days ago
The issue isn't that they found a copy of the test data online. (The test input data was provided to them as part of the problem.)

The issue is that they manually labeled the test data, and then pretended they didn't.

The competition objective is to provide an ML solution that produces labels for the test data, showing your work with code (to prove you didn't just hand-label the data).

Instead, they did manually label the data, and hid their manual labels in the id column of that external data source.

1 comment

Ah, thank you, that part wasn't clear when I was reading it!