Hacker News new | ask | show | jobs
by fwilliams 2892 days ago
Somewhat tangentially, some recent work showed that a lot of problems with images (e.g. denoising, upsampling, inpainting, etc...) can be solved very efficiently with no training set at all: https://dmitryulyanov.github.io/deep_image_prior

This work shows that the network architecture is a strong enough prior to effectively learn this set of tasks from a single image. Note that there is no pretraining here whatsoever.

More to your point, I think a big problem with toy tasks are not so much the tasks but the datasets. A lot of datasets (particularly in my field of geometry processing) have a tremendous amount of bias towards certain features.

A lot of papers will show their results trained and evaluated on some toy dataset. Maybe their claim is that using such-and-such a feature as input improves test performance on such-and-such problem and dataset.

The problem with these papers often comes when you try to generalize to data that is similar but not from the toy dataset. A lot of applied ML papers fail to even moderately generalize, and the authors almost never test or report this failure. As a result, I think we can spend a lot of time designing over-fitted solutions to certain problems and datasets.

On the flipside, there are plenty of good papers which do careful analysis of their methods' ability to generalize and solve a problem, but when digging through the literature its important to be judicious. I've wasted time testing methods that turn out to work very poorly.