by carrolldunham 1802 days ago
I could be wrong but your expectation seems to involve fantasy - I didn't think anybody expected any algorithm to rediscover the real details. Is that not impossible unless the model was trained on the specific image it's recreating (which would be pointless)?
1 comment

That's not my expectation. As I said,

> Even at its best the algorithm seems to come up with something plausible but often wrong (exactly as expected)

So the expectation is that the algorithm is only as good as, or slightly worse than, what a human would assume the original photograph looked like. SR3 meets that expectation, more or less.

That said, I would argue that there's no reason to assume that humans are equipped with maximally efficient upscaling / content-aware filling algorithms in our brains, either. It might certainly be possible to come up with an ML approach that was able to discover details in the image that were entirely lost on even highly trained human viewers.

I also wanted to point out in my comment that the approach is still pretty flawed in a lot of ways, even compared against the yardstick of human intuition. The woman's hair on page 16 of the paper doesn't pass a plausibility test - it just looks like goo.

At the end of the day, what a lot of people (in this thread and elsewhere - see this previous Google research https://hific.github.io/) are interested in is an ML approach that can achieve image compression that is vastly superior to any current approach. That requires being able to recover the real details. My comment answers the question many people probably have, and the answer is that this algorithm doesn't do that (at least with the sample images given).

> It might certainly be possible to come up with an ML approach that was able to discover details in the image that were entirely lost on even highly trained human viewers.

This is certainly a fair point, but that's a different question from recreating the real details from a compressed image where that data is just gone. There is an infinity of images that downsample to, say, the same 2 pixel by 2 pixel image. You can make an algorithm that produces candidate sources, but it cannot tell which candidate was the actual source. The data processing inequality is the formal result covering this: no processing of the downsampled image can yield more information about the original than the downsampled image already contains. So where you are saying this algorithm doesn't do it, it's moot because it can't be done.
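
To make the many-to-one point concrete, here is a minimal sketch in plain Python. It assumes a simple box-filter downsample (averaging each block of pixels), which is my choice for illustration and not necessarily what SR3's training data used; the `downsample` helper and the two tiny 4x4 grayscale images are made up for the example.

```python
def downsample(img, factor):
    """Average each factor x factor block into one pixel (box filter)."""
    height, width = len(img), len(img[0])
    out = []
    for by in range(0, height, factor):
        row = []
        for bx in range(0, width, factor):
            block = [
                img[y][x]
                for y in range(by, by + factor)
                for x in range(bx, bx + factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# Two clearly different hypothetical 4x4 sources:
# a flat gray image and a high-contrast checkerboard.
flat = [[100] * 4 for _ in range(4)]
checker = [[0 if (x + y) % 2 == 0 else 200 for x in range(4)] for y in range(4)]

small_flat = downsample(flat, 2)
small_checker = downsample(checker, 2)

print(flat == checker)              # the sources differ
print(small_flat == small_checker)  # their 2x2 downsamples are identical
```

Given only the 2x2 result, nothing lets you decide whether the source was the flat image, the checkerboard, or any other image with the same block averages. An upscaler can pick a plausible candidate, but it is choosing based on its prior, not recovering the original.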