It is pretty impressive/crazy how well CDM and SR3 work together to go from 32x32 to 256x256, e.g. the Irish Setter. How could the algos possibly know about the lighter coloring (due to breed or lighting) between the dog's eyes?! It's basically inventing pixels.
Even basic upscaling algorithms can guess a surprising amount of detail.
A while back, when I was putting together a simple and fast upscaling method, I compared my own against the very, very basic algorithms and ended up with this [0].
The far left is the original; the others just vary the scale percentage. There's a surprising amount of detail kept, even though all of the algorithms were pushed way beyond what should be considered their limits. (That was on purpose, to expose bias in a form that was easier to analyse.)
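For anyone wondering what "very, very basic" looks like in practice, here is a minimal sketch of two textbook upscalers, nearest-neighbour and bilinear, in plain Python on a grayscale image stored as a list of rows. These are not the algorithms from the comparison above; the function names and the tiny test image are made up for illustration.

```python
def upscale_nearest(img, factor):
    """Nearest-neighbour: repeat every pixel `factor` times in both directions."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def upscale_bilinear(img, factor):
    """Bilinear: blend the four nearest source pixels by distance."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h * factor):
        # Map the output pixel centre back into source coordinates,
        # clamping so that border pixels do not read outside the image.
        sy = min(max((y + 0.5) / factor - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for x in range(w * factor):
            sx = min(max((x + 0.5) / factor - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# Hypothetical 2x2 test image, upscaled 2x with both methods.
tiny = [[0, 100], [100, 200]]
blocky = upscale_nearest(tiny, 2)
smooth = upscale_bilinear(tiny, 2)
```

Nearest-neighbour keeps hard edges but looks blocky; bilinear smooths the transitions but cannot add detail that is not already in the source pixels.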
If we start with multiple source images that are "small" (by some definition of small) perturbations of each other and upscale them, what can be said about the results?
Some of the images in the article seemed to be high-res images that were downscaled to low-res (and it makes sense to see how the upscaling process changes the original), but wouldn't that make it easier for the ML to reverse the downscaling process than to take an original low-res photo and upscale it?
This is true. Downscaling an image and then training a neural network to scale it back up is the way single-image super-resolution systems typically work.
Research papers need to evaluate their models, and how can you evaluate a scaled-up image unless you have the original ground truth to compare it to?
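A rough sketch of that evaluation loop, assuming box-filter downscaling, a nearest-neighbour stand-in for the model, and PSNR as the metric (none of which is claimed to match what any particular paper uses). All function names here are hypothetical.

```python
import math


def downscale_box(img, factor):
    """Average each factor x factor block into one low-res pixel."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [
        [
            sum(
                img[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / factor ** 2
            for x in range(w)
        ]
        for y in range(h)
    ]


def upscale_nearest(img, factor):
    """Stand-in for the model under test: plain pixel repetition."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    diffs = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Hypothetical 4x4 "high-res" ground truth: a simple gradient.
ground_truth = [[16 * (x + y) for x in range(4)] for y in range(4)]
low_res = downscale_box(ground_truth, 2)   # what the model gets as input
restored = upscale_nearest(low_res, 2)     # what the model produces
score = psnr(ground_truth, restored)       # comparison against the original
```

The score is only defined because the high-res original exists; with a photo that was low-res to begin with, there is nothing to compare against.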
Training this way can introduce dataset shift: the network only ever sees the kind of content and degradation that was in its training set. For example, if you train a network to upscale 1080p movie frames to 4k, the results might be disappointing when you try to scale 4k to 8k.