by adhoc32 454 days ago
Instead of training on vast amounts of arbitrary data that may lead to hallucinations, wouldn't it be better to train on high-resolution images of the specific subject we want to upscale? For example, using high-resolution modern photos of a building to enhance an old photo of the same building, or using a family album of a person to upscale an old image of that person. Does such an approach exist?
4 comments

Author here -- Generally, in single-image super-resolution we want to learn a prior over natural high-resolution images, and for that a large and diverse training set is beneficial. Your suggestion sounds interesting, though it's more reminiscent of multi-image super-resolution, where additional images contribute extra information that has to be registered appropriately.

That said, our approach is actually trained on a (by modern standards) rather small dataset, consisting only of 800 images. :)

It feels like it's multishot NL-means, then immediately those pre-trained "AI upscale" things like Topaz, with nothing in between. Like, if I have 500 shots from a single session and I would like to pile the data together to remove noise and increase detail, preferably starting from the raw data, then... nothing? The only guys doing something like that are astrophotographers, but their tools are... specific.

But for "normal" photography, it is either pre-trained ML, pulling external data in, or something "dumb" like anisotropic blurring.
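To make the "pile the data together" idea concrete, here is a minimal sketch of plain mean stacking, the simplest thing astro tools do. It assumes the frames are already registered (aligned) and that noise is independent Gaussian per pixel; the 1-D `scene`, the frame count, and the noise level are made up for illustration, and real raw data would need alignment and demosaicing first.

```python
import math
import random


def make_frames(scene, n_frames, noise_sigma, rng):
    """Simulate n_frames noisy exposures of the same, already aligned scene."""
    return [[p + rng.gauss(0.0, noise_sigma) for p in scene]
            for _ in range(n_frames)]


def stack_mean(frames):
    """Average the frames pixel by pixel."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def rmse(a, b):
    """Root-mean-square error between two equally long pixel lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


rng = random.Random(0)
# Hypothetical 1-D "image": a smooth signal standing in for real pixel rows.
scene = [100.0 + 50.0 * math.sin(i / 10.0) for i in range(200)]
frames = make_frames(scene, n_frames=100, noise_sigma=10.0, rng=rng)
stacked = stack_mean(frames)

single_rmse = rmse(frames[0], scene)
stacked_rmse = rmse(stacked, scene)
print(f"single frame RMSE:    {single_rmse:.2f}")
print(f"100-frame stack RMSE: {stacked_rmse:.2f}")
```

With independent noise, the error of the average should shrink roughly with the square root of the number of frames. Note that this only removes noise; recovering detail beyond the sensor resolution additionally needs sub-pixel shifts between frames and a registration step.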

I'm not a data scientist, but I assume that having more information about the subject would yield better results. In particular, upscaling faces doesn't produce convincing outcomes; the results tend to look eerie and uncanny.

Not a data scientist, but my understanding is that restricting the set of training data for the initial training run often results in poorer inference, because the model sees less data. When you’re training the early layers of a model, you’re often learning to recognize rather abstract features, such as boundaries between different colors.

That said, there is a benefit to fine-tuning a model on a reduced data set after the initial training. The initial training with the larger dataset means that it doesn’t get entirely lost in the smaller dataset.
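A toy analogy of that pretrain-then-fine-tune idea, as a sketch only: a straight line y = a*x + b stands in for the model, the slope `a` plays the role of the general "early layer" features, and the offset `b` is the subject-specific part. All the data and names here are invented for illustration, and closed-form least squares stands in for actual training.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return a, mean_y - a * mean_x


def fit_offset_only(xs, ys, a):
    """Keep the slope a frozen and refit only the offset b."""
    return sum(y - a * x for x, y in zip(xs, ys)) / len(xs)


def max_error(a, b, true_a, true_b, xs):
    """Worst-case prediction error against the true line over xs."""
    return max(abs((a * x + b) - (true_a * x + true_b)) for x in xs)


# "Pretraining": a large, diverse dataset drawn from y = 2x + 1.
broad_x = [float(i) for i in range(10)]
broad_y = [2.0 * x + 1.0 for x in broad_x]
pre_a, pre_b = fit_line(broad_x, broad_y)

# Subject-specific data: same slope, offset 1.5, but only three
# tightly clustered samples with a little fixed noise.
subject_x = [4.0, 4.1, 4.2]
subject_y = [9.6, 9.6, 9.8]

# Option 1: train from scratch on the tiny dataset.
scratch_a, scratch_b = fit_line(subject_x, subject_y)

# Option 2: fine-tune, keeping the pretrained slope and refitting the offset.
tuned_a = pre_a
tuned_b = fit_offset_only(subject_x, subject_y, pre_a)

scratch_err = max_error(scratch_a, scratch_b, 2.0, 1.5, broad_x)
tuned_err = max_error(tuned_a, tuned_b, 2.0, 1.5, broad_x)
print(f"from scratch, worst error: {scratch_err:.3f}")
print(f"fine-tuned, worst error:   {tuned_err:.3f}")
```

The from-scratch fit gets the slope badly wrong because three nearby noisy points barely constrain it, while the fine-tuned fit inherits the slope from the broad data and only has to learn the offset. Real networks are far messier, but the intuition is the same.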

That is how Hollywood currently de-ages famous actors, by training on their photos and stills from when they were around the desired age.

But it's extremely time-consuming and currently expensive.

That is effectively what it's doing already. If you examine the artifacts, there is obviously a bias towards certain types of features.