by artine 537 days ago
I’m not closely familiar with this benchmark, but data leakage in machine learning can be way too easy to accidentally introduce even under the best of intentions. It really does require diligence at every stage of experiment and model design to strictly firewall all test data from any and all training influence. So it’s not surprising when leakage breaks highly publicized benchmarks.
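
To make the "firewall" point concrete, here is a minimal sketch of one common form of leakage and how to avoid it: split first, then compute any preprocessing statistics from the training rows only. The function name `split_then_standardize`, the 80/20 split, and the toy data are all made up for illustration and have nothing to do with the benchmark under discussion.

```python
import random
import statistics


def split_then_standardize(values, test_fraction=0.2, seed=0):
    """Split first, then fit scaling statistics on the training part only."""
    rng = random.Random(seed)
    indices = list(range(len(values)))
    rng.shuffle(indices)

    n_test = max(1, int(len(values) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    train = [values[i] for i in train_idx]
    test = [values[i] for i in test_idx]

    # The mean and standard deviation come from training rows only.
    # Computing them over all rows would let test data shape the features.
    mean = statistics.fmean(train)
    std = statistics.pstdev(train) or 1.0

    def scale(xs):
        return [(x - mean) / std for x in xs]

    return scale(train), scale(test), set(train_idx), set(test_idx)


# Hypothetical toy data: 100 numeric samples.
values = [float(v) for v in range(100)]
train_scaled, test_scaled, train_ids, test_ids = split_then_standardize(values)

# Cheap sanity check: no sample appears on both sides of the split.
assert train_ids.isdisjoint(test_ids)
```

The same discipline applies to anything else fitted to data (feature selection, deduplication, hyperparameter tuning), which is why it is so easy to miss a spot.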

> data leakage in machine learning can be way too easy to accidentally introduce even under the best of intentions

And lots of people in this space definitely don’t have the best of intentions.