| HN Mirror



	by Leynos 3526 days ago
	A friend of mine suggested that an approach similar to this could be used to upscale old standard definition TV shows (specifically, those shot on video rather than film). I'd imagine that multiple specially trained networks would be employed for different parts of the image (trained on pictures of individual performers or types of set/background). Pleased to see that this is possible. Is there anyone doing something along those lines already?

4 comments

Mithaldu 3526 days ago

It should also be possible to train it on itself to improve moving scenes by using the motion itself as temporal super-sampling, just like the human eye does.


mm_alex 3525 days ago

This works quite well and does not necessarily require any NN/machine learning. See the YouTube video for this paper: https://www.disneyresearch.com/publication/scenespace/ TL;DR: a simple brute-force weighted average of samples from many frames, combined with a noisy, low-quality depth-from-motion estimate, can be used to denoise, increase resolution, and otherwise manipulate video footage. Very cool paper with great results from a simple technique.
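To make the "weighted average of samples from many frames" idea concrete, here is a minimal sketch in plain Python. It is not the paper's method: it assumes the frames are already aligned (for example, a static camera) and skips the depth-from-motion step entirely. The function name, the Gaussian weighting, and the `sigma_t` / `sigma_v` parameters are all illustrative assumptions.

```python
import math


def temporal_weighted_average(frames, center, sigma_t=2.0, sigma_v=0.1):
    """Denoise frames[center] by averaging each pixel across frames.

    frames: list of equally sized 2D lists of floats in [0, 1],
            assumed to be pre-aligned (hypothetical simplification).
    Weights fall off with temporal distance from the center frame and
    with the difference in value from the center frame's pixel, so
    samples that disagree strongly contribute very little.
    """
    ref = frames[center]
    height, width = len(ref), len(ref[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            weight_sum = 0.0
            for t, frame in enumerate(frames):
                dt = t - center
                dv = frame[y][x] - ref[y][x]
                w_time = math.exp(-(dt * dt) / (2.0 * sigma_t ** 2))
                w_value = math.exp(-(dv * dv) / (2.0 * sigma_v ** 2))
                w = w_time * w_value
                total += w * frame[y][x]
                weight_sum += w
            # weight_sum >= 1 because the center frame always has weight 1.
            out[y][x] = total / weight_sum
    return out
```

With real footage the samples would first have to be brought into correspondence, which is where the depth estimate mentioned above comes in.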


ttul 3526 days ago

Ooh


louprado 3526 days ago

As you suggested, continuity of appearance is what makes this problem so difficult.

As a child, I recall watching a movie that had been converted from black-and-white to color. There were many distracting artifacts. Most notably, the actors' hairlines would shift as they rotated their heads. It made the film unwatchable.


nuclai 3526 days ago

(Author here.) Absolutely! With multiple super-resolution networks, not only continuity but also blending between different regions would present problems. I agree there's a lot of value in domain-specific networks here, as you can see from the faces example on GitHub.

I'd be curious to see an ensemble-based super-resolution, where each model can output the confidence of a pixel region, then have another network learn to blend the result.
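A rough sketch of what the blending step could look like, in plain Python. This swaps the learned blending network for a fixed per-pixel softmax over the confidences, purely for illustration; `blend_by_confidence`, its arguments, and the `temperature` parameter are hypothetical names, not anything from the project.

```python
import math


def blend_by_confidence(predictions, confidences, temperature=1.0):
    """Blend per-model outputs using per-pixel confidence scores.

    predictions: one 2D list of floats per model (same shape for all).
    confidences: one 2D list of raw confidence scores per model.
    A softmax over the models' scores at each pixel gives the blend
    weights. A learned blending network would replace this fixed rule.
    """
    height, width = len(predictions[0]), len(predictions[0][0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            scores = [c[y][x] / temperature for c in confidences]
            peak = max(scores)  # subtract the max for numerical stability
            weights = [math.exp(s - peak) for s in scores]
            norm = sum(weights)
            blended = sum(w * p[y][x] for w, p in zip(weights, predictions))
            out[y][x] = blended / norm
    return out
```

Where the models are equally confident this reduces to a plain average; where one model is far more confident, its output dominates that pixel.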

Conversely, these results were achieved using a single top-of-the-range GPU. Everything fits in memory with a batch size of 15 at 192x192. By distributing the training somehow, you could make the network 10x bigger, train for a whole week, and likely get much better general-purpose results.


Keyframe 3526 days ago

> Is there anyone doing something along those lines already?

I have a side business doing film restoration and am not aware of any solution like that. Probably the best upscaling solution available is from Teranex, which was acquired by Blackmagic Design. Evertz probably also has something in their offering.


Houshalter 3526 days ago

It should work; I don't think you need to bother with training it on individual performers. Someone made something like this to improve low-res anime, and it worked well.

In theory you could use this to increase temporal resolution as well. Turn 24 fps movies into 60 fps, and upscale regular HD to 4k.
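For the frame-rate part, the timing arithmetic is simple even if the image synthesis is not. The sketch below, in plain Python, only works out which pair of source frames each output frame falls between and how far along it sits; a naive linear cross-fade stands in for the network that would actually predict the in-between frame, and it will ghost on motion. The function names are made up for illustration.

```python
from fractions import Fraction


def interpolation_plan(num_source_frames, src_fps=24, dst_fps=60):
    """Map each output frame to (left_index, right_index, alpha).

    alpha is the fractional position between the two source frames.
    Fractions keep the 24/60 = 2/5 step exact across long clips.
    """
    step = Fraction(src_fps, dst_fps)
    last = num_source_frames - 1
    plan = []
    i = 0
    while i * step <= last:
        pos = i * step
        left = pos.numerator // pos.denominator
        right = min(left + 1, last)
        plan.append((left, right, float(pos - left)))
        i += 1
    return plan


def cross_fade(frame_a, frame_b, alpha):
    """Naive placeholder for a learned interpolator: per-pixel linear blend."""
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]
```

At 24 to 60 fps, only every fifth output frame lands exactly on a source frame; the rest sit at offsets of 0.2, 0.4, 0.6 or 0.8 between two frames and would have to be synthesized.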
