Hacker News
by erichahn 1877 days ago
No, this does not solve the problem he describes in the article. You can have a great cross-validation score and still struggle on unseen data if that data is sufficiently dissimilar from your training set, for example X-ray scans produced by a different machine. There are numerous other examples: CNNs, for instance, are well known to fail on images with added white noise, even though those images look the same to a human.
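A rough sketch of the point, using made-up 1-D data rather than real scans: a threshold classifier gets a high cross-validation score, then does much worse once every reading is shifted by a constant. The `offset` parameter is a hypothetical stand-in for "a different machine", and the classifier and all numbers are assumptions chosen for illustration only.

```python
import random
import statistics


def make_data(n, offset, rng):
    """Two 1-D classes; `offset` mimics a systematic shift between machines."""
    data = [(rng.gauss(0.0, 1.0) + offset, 0) for _ in range(n)]
    data += [(rng.gauss(4.0, 1.0) + offset, 1) for _ in range(n)]
    rng.shuffle(data)
    return data


def fit_threshold(train):
    """Decision threshold: midpoint between the two class means."""
    m0 = statistics.mean(x for x, y in train if y == 0)
    m1 = statistics.mean(x for x, y in train if y == 1)
    return (m0 + m1) / 2


def accuracy(threshold, data):
    """Fraction of points where 'x above threshold' matches the label."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)


def cross_val_accuracy(data, k=5):
    """Plain k-fold cross-validation on data from a single distribution."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(accuracy(fit_threshold(train), folds[i]))
    return statistics.mean(scores)


rng = random.Random(0)
train = make_data(1000, offset=0.0, rng=rng)
shifted = make_data(1000, offset=3.0, rng=rng)  # same classes, "different machine"

cv_acc = cross_val_accuracy(train)
shifted_acc = accuracy(fit_threshold(train), shifted)
print(f"cross-validation accuracy: {cv_acc:.3f}")
print(f"accuracy on shifted data:  {shifted_acc:.3f}")
```

Every cross-validation fold is drawn from the same distribution as the training data, so the score cannot reveal how the model behaves after the shift.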
1 comment

For the last example, could they train on "images + white noise" instead?
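For concreteness, here is a minimal sketch of what that suggestion might look like as a data-augmentation step. The toy 2x2 "images", the helper names `add_white_noise` and `augment`, and the noise level `sigma` are all hypothetical choices for illustration, not anything from the article.

```python
import random


def add_white_noise(image, sigma, rng):
    """Copy `image` (rows of floats in [0, 1]), add Gaussian noise, clip to [0, 1]."""
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def augment(dataset, sigma, rng):
    """Keep every original (image, label) pair and append one noisy copy of each."""
    noisy = [(add_white_noise(img, sigma, rng), label) for img, label in dataset]
    return dataset + noisy


rng = random.Random(0)
# Hypothetical toy data: 2x2 grayscale patches with a class label.
dataset = [
    ([[0.0, 0.5], [0.5, 1.0]], 0),
    ([[1.0, 0.5], [0.5, 0.0]], 1),
]
augmented = augment(dataset, sigma=0.1, rng=rng)
print(len(dataset), "->", len(augmented), "training examples")
```

This only covers the one perturbation that was anticipated. Data from a different machine, or a different kind of noise, would still count as unseen for a model trained this way.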