by fredophile 3955 days ago
I'll admit that I skimmed the article, but I have the feeling this CNN didn't learn what they intended it to learn. Looking at the examples shown, they started with a full-resolution image and applied some downsampling algorithm to get the lower-resolution image that they then fed to their algorithm. Their algorithm has learned to undo the specific downsampling that they applied. This doesn't mean it will perform well on images that haven't been downsampled, or on images that have been downsampled in a different way.
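
To make the concern concrete, here is a minimal sketch of how such training pairs might be built. It is not the article's pipeline: the two downsamplers (`decimate` and `box_average`) and the toy 4x4 checkerboard are assumptions chosen purely for illustration. It only shows that the same full-resolution image yields different low-resolution inputs depending on the downsampler, so a network trained on one kind never sees the other.

```python
# Illustrative sketch only: both downsamplers are assumptions,
# not the method used in the article.

def decimate(img, factor=2):
    """Keep every `factor`-th pixel in each dimension, discarding the rest."""
    return [row[::factor] for row in img[::factor]]

def box_average(img, factor=2):
    """Average each factor x factor block of pixels into a single pixel."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(0, h - h % factor, factor):
        out_row = []
        for x in range(0, w - w % factor, factor):
            block = [img[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

# Toy 4x4 grayscale "full resolution" image: a one-pixel checkerboard.
high_res = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]

# Each training pair is (network input, target the output is compared to).
pair_decimated = (decimate(high_res), high_res)
pair_averaged = (box_average(high_res), high_res)

print(pair_decimated[0])  # all black: decimation only ever hits the 0 pixels
print(pair_averaged[0])   # uniform mid-gray: each block averages to 127.5
```

The target is identical in both pairs, but the inputs are not, which is why a model fitted to one downsampler may not carry over to another.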
3 comments

This is possibly true, but it's pretty much the only practical way to do this.

If we look at most of the literature around upscaling, this method is used pretty frequently.

For a more comprehensive look at using CNNs for image upscaling, see e.g. http://research.microsoft.com/en-us/um/people/kahe/publicati...

There is a more recent version of this paper published here: http://arxiv.org/abs/1501.00092v3

At Flipboard, we did not have time to do a full comparison of related upscaling research, but we were happy with the low amount of error our CNN achieved.

They downsampled by 2x; in other words, they just dropped half the pixel data. They then gave this half of the data to the algorithms. The full data was only used so they would have something to compare the output to.

How is that any different from just not capturing half the picture data? I don't see how it would be. You do realize a digital camera is just an array of sensors, right? What happens if your camera has half as many sensors? The same thing as what they did: you have half as many pixels.
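
One way to probe this question is a toy 1-D model of a sensor array. The model is an assumption, not something from the article: each sensor simply averages the light falling on its share of the scene, ignoring optics, color filters, and noise. The `capture` helper and the `scene` values are hypothetical.

```python
# Toy model (assumption): a sensor reports the mean light over the
# stretch of scene it covers. Real cameras are more complicated.

def capture(scene, n_sensors):
    """Split the scene evenly among n_sensors and average each share."""
    width = len(scene) // n_sensors
    return [sum(scene[i * width:(i + 1) * width]) / width
            for i in range(n_sensors)]

# Fine-grained light intensity along one line of the scene.
scene = [0, 0, 0, 10, 10, 0, 0, 0]

full_camera = capture(scene, 4)   # 4 small sensors
half_camera = capture(scene, 2)   # 2 sensors, each twice as wide

# Downsampling the 4-sensor capture by dropping every other pixel.
dropped = full_camera[::2]

# Downsampling the 4-sensor capture by averaging neighboring pixels.
averaged = [(full_camera[i] + full_camera[i + 1]) / 2
            for i in range(0, len(full_camera), 2)]

print(full_camera)  # [0.0, 5.0, 5.0, 0.0]
print(half_camera)  # [2.5, 2.5]
print(dropped)      # [0.0, 5.0]
print(averaged)     # [2.5, 2.5]
```

In this idealized model, dropping pixels does not match what a camera with half as many sensors would record, while averaging neighbors does. Whether a given downsampler resembles a native capture therefore depends on which downsampler was used.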

Is it possible to differentiate a downsampled image from an image captured natively at a given resolution? My gut tells me no, but this certainly isn't my field.

Not my field either. If a picture is just a big 2D array, then I don't think so: you're deleting bits, and information is lost.

If the picture is a collection of summed sine waves, maybe. If the big picture is just sampling more frequently, then maybe it's cheating by looking at the encoding. The smaller resolution will have sampling problems: it'll lose higher-frequency data, because it isn't sampled often enough.
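
The sampling problem mentioned here is aliasing, and a small sketch can show it. The numbers are assumptions picked to keep the arithmetic clean: 8 samples at the fine rate, 4 at the coarse rate, and cosines of 1 and 3 cycles. The `sample_cosine` helper is hypothetical.

```python
import math

def sample_cosine(cycles, n_samples):
    """Sample a cosine with `cycles` full periods at n_samples even points."""
    return [math.cos(2 * math.pi * cycles * n / n_samples)
            for n in range(n_samples)]

# At 8 samples, 1 cycle and 3 cycles are both below the Nyquist limit (4),
# so the two signals are clearly distinguishable.
fine_low = sample_cosine(1, 8)
fine_high = sample_cosine(3, 8)

# Keep every other sample: 4 samples, Nyquist limit now 2 cycles.
coarse_low = fine_low[::2]
coarse_high = fine_high[::2]

distinct_when_fine = any(
    not math.isclose(a, b, abs_tol=1e-9) for a, b in zip(fine_low, fine_high))
same_when_coarse = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(coarse_low, coarse_high))

print(distinct_when_fine)  # True
print(same_when_coarse)    # True: the 3-cycle wave now looks like 1 cycle
```

After the samples are dropped, the higher-frequency wave is indistinguishable from the lower-frequency one, which is the sense in which the smaller resolution loses high-frequency data.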

I dunno. I can see the OP's point. Maybe there are artifacts introduced by scaling down. Still, regardless of the mechanism, information theory tells us you can't throw away samples like this losslessly. Information is lost, and the NN needs to make something up to fill in the blanks. Looks better than bicubic to me!
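
A tiny sketch of why the blanks have to be made up, under assumed toy data: two different full-resolution rows can downsample to exactly the same low-resolution row, so no upscaler can tell which one it started from. The rows and both helper functions are hypothetical, and pair averaging stands in for whatever downsampler is actually used.

```python
# Toy data and helpers are assumptions for illustration.

def average_pairs(row):
    """Downsample a row by 2x, averaging each pair of neighbors."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]

def nearest_upscale(row):
    """Upscale a row by 2x by repeating each value."""
    return [v for v in row for _ in range(2)]

row_a = [10, 30, 50, 70]   # smooth ramp
row_b = [20, 20, 60, 60]   # two flat steps

low_a = average_pairs(row_a)
low_b = average_pairs(row_b)

# Both originals collapse to the same low-resolution row.
print(low_a, low_b)  # [20.0, 60.0] [20.0, 60.0]

# Any deterministic upscaler must give one answer for both inputs,
# so it can match at most one of the two originals.
guess = nearest_upscale(low_a)
error_a = sum(abs(g - t) for g, t in zip(guess, row_a))
error_b = sum(abs(g - t) for g, t in zip(guess, row_b))
print(error_a, error_b)  # 40.0 0.0
```

A learned upscaler picks the reconstruction that was most common in its training data, which can look better than bicubic without recovering the discarded information.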