A bit of a letdown that the video demoing SDR->HDR conversion is itself only published in SDR. Makes as much sense as demoing a colorization tool in a grayscale video!
At this point, with any new model I think it makes sense to wait until you can run the model on your own input before making any assumptions based on cherry picked examples.
If they were serious about showing this tech off, they should've provided a video file download and indicated that it's an HDR file that should only be viewed on an HDR display. YouTube is just making this look bad, as people won't see a difference.
YouTube tends to post a downscaled SD version first, then they encode and post the higher-res versions when they get around to it. This can take days in some cases. Meanwhile the creator catches the flak...
You don't need high res for HDR on YouTube (144p HDR is a thing there oddly enough) and the 4k version had already processed when I posted that comment (with no change since in HDR availability). Usually media announcements/large channels pre-upload the video so it's ready when they want it to actually publish to avoid that kind of issue though.
4K processing takes just minutes, but HDR processing can take over a month to… never. There is no indication of this at all, no progress bar or eta. Just check manually every few days!
This is why everyone is giving up on HDR, it’s just too painful because the content distributors are all so bad at it, with Netflix being the sole exception.
It's more reliable than on Linux though, and Windows has been doing "auto HDR" for videos for years, so it's kinda hard to tell when something is HDR or not there.
At least as of a couple of years ago, HDR support on YouTube has been pretty bad[1]. I know they've been working to improve things since, but I kind of don't blame people for walking away from that mess.
I'm also glad that Rambalac is back, as he quit a few months ago. I've recently also started uploading 4K60 HDR content to YouTube [1], and it takes them up to a week longer to encode than the SDR version. You can include your own LUT instead of relying on YouTube's conversion, which seems to help. Here's an article and LUT [2] + a video [3] with valuable info. They let me convert DJI Pocket 3 HLG recordings to HDR10.
Can't recommend Rambalac enough - I have pretty much retraced his steps a couple of times during our Japan trip & it really helped with orientation. :)
Also, some of the walks are really interesting & really give you the context of various places in Japan. :)
- Some DaVinci Resolve settings to use on SDR monitors: https://youtu.be/4izJfgRtkZE (though I upload 4K60 HDR at 37.5 Mbit/s, which is enough for my slow content).
This only allows a single LUT for the entire video. For comparison, Resolve will perform Dolby Vision tone mapping from HDR to SDR on a clip-by-clip basis.
I guess. There's a lot of details we don't know that would change the calculus on this.
To use an analogous workflow, it could be like saying, "It's pointless to shoot video in 10-bit log if it's going to be displayed on Rec.709 at 8-bits." It completely leaves out available transforms and manipulations in HDR that do have a noticeable impact even when SDR is the target.
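A rough way to see why the extra source precision matters even for an 8-bit target: the sketch below stretches the darkest quarter of the range to full range (a hypothetical grading move) and counts how many distinct 8-bit output codes survive. It assumes linear encoding with no log curve and no dithering, and the function name is made up for illustration.

```python
def distinct_output_levels(source_bits, lo=0.0, hi=0.25, target_bits=8):
    """Count distinct target codes left after stretching [lo, hi] to full range.

    Simplification: linear encoding, no log curve, no dithering.
    """
    src_max = (1 << source_bits) - 1
    dst_max = (1 << target_bits) - 1
    levels = set()
    for code in range(src_max + 1):
        x = code / src_max                  # normalize the source code to 0..1
        if lo <= x <= hi:                   # keep only the range being stretched
            y = (x - lo) / (hi - lo)        # stretch it to 0..1
            levels.add(round(y * dst_max))  # quantize to the delivery bit depth
    return len(levels)


levels_from_8bit = distinct_output_levels(8)
levels_from_10bit = distinct_output_levels(10)
print(levels_from_8bit, levels_from_10bit)
```

Under these assumptions the 8-bit source leaves only 64 of the 256 output codes in use (visible banding), while the 10-bit source fills nearly all of them.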
Again, we can't know if it's important given the information that's available, but we can't know if it's pointless either.
I could see a future where this works really well. It doesn't seem to be the case right now though.
The "super resolution" showcased in the video seemed almost identical to adjusting the "sharpness" in any basic photo editing software. That is to say, perceived sharpness goes up, but the actual conveyed detail stays identical.
Allegedly the new OnePlus phone does this trick in real time, as well as upsampling and interframe motion interpolation. Mrwhosetheboss seems impressed, but I don't really trust his judgment on these things yet.
Whatever special sauce the Nvidia Shield uses is honestly incredible. Real-time upscaling of any stream, and not just optimized for low-res sources; it's like a force multiplier on content that is already HD. Supposedly the Windows drivers do it as well, but the effect seems less noticeable to me in my tests.
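For reference, the classic "sharpness" slider is usually some form of unsharp masking. A minimal 1-D sketch, assuming a simple box blur (real editors typically use a Gaussian, and this is not a claim about what Nvidia's filter does):

```python
def box_blur(signal, radius=1):
    """Moving average with indices clamped at the edges."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def unsharp_mask(signal, amount=1.0, radius=1):
    """Add back the difference between the signal and its blurred copy."""
    blurred = box_blur(signal, radius)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


step_edge = [0.2] * 4 + [0.8] * 4
sharpened = unsharp_mask(step_edge)
flat = unsharp_mask([0.5] * 6)
print(sharpened)
```

The edge gains overshoot on both sides, which reads as "sharper", but every output value is a fixed linear combination of neighbouring input samples, so nothing is recovered that was not already in the signal.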
An HN search of "Deep Space Nine" and "Topaz" will show some great discussions here covering the dearth of such upscaling solutions, as well as some huge efforts before commonplace AI.
Links to the articles found in those discussions lead to some very enlightening efforts by Joel Hruska to find an upscaling solution related to what the person asked about. As somebody else mentioned, Topaz is out there and Hruska gave it a good shot, but it is not open source.
It's not exactly what you're after, as it's anime specific and you need to process the video yourself (e.g. disassemble to frames, run the upscaler, then assemble back to a movie file), but Real-ESRGAN is very good for cleaning up old, low-resolution anime:
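The disassemble / upscale / reassemble steps can be sketched as three command lines. This assumes the `realesrgan-ncnn-vulkan` executable and `ffmpeg` are on the PATH; the helper name, directory layout, frame rate, and output file name are hypothetical, and the commands are only built here, not run.

```python
from pathlib import Path


def build_upscale_commands(src, workdir, fps="24000/1001", scale=2,
                           model="realesr-animevideov3"):
    """Return argv lists for: extract frames, upscale them, rebuild the video."""
    work = Path(workdir)
    frames_in = work / "frames_in"    # both directories must exist before running
    frames_out = work / "frames_out"
    pattern = "%08d.png"

    extract = ["ffmpeg", "-i", str(src), str(frames_in / pattern)]
    upscale = ["realesrgan-ncnn-vulkan", "-i", str(frames_in),
               "-o", str(frames_out), "-n", model, "-s", str(scale), "-f", "png"]
    # Take video from the upscaled frames and (if present) audio from the source.
    assemble = ["ffmpeg", "-framerate", fps, "-i", str(frames_out / pattern),
                "-i", str(src), "-map", "0:v:0", "-map", "1:a:0?",
                "-c:v", "libx264", "-pix_fmt", "yuv420p", "-c:a", "copy",
                str(work / "upscaled.mp4")]
    return [extract, upscale, assemble]


commands = build_upscale_commands("episode.mkv", "work")
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` in order. The `fps` value has to match the source, and copying the audio stream as-is may not work for every source codec and container combination.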
If you want to avoid manual processing, Anime4K runs in real time as a GLSL shader you can load into MPV (or Plex or something called IINA according to the readme) and still gives great results.
It depends on what you mean by 'open-source': along with training materials and full setup? That will be hard to find. Upscaling was popular about 10 years back, which is why there is not much interest today. Training in the old style isn't that hard, but artifacts pop up in all the videos I've seen.
The RTX video upscaling feature works really well, there's a bug in the Firefox implementation that allows you to switch between native and upscaled side by side and the difference is striking. I don't have an HDR monitor so I can't tell you how well this new HDR feature works.
I don't think making things up is the problem, it's if it's believable. If it's indistinguishable to a viewer, then who cares. I never would have thought the HDR of the clouds was "made up".
I would use it on every single video I've ever made myself, because intent had nothing to do with how my videos look. They were made with the best camera I had available, and HDR has only been available relatively recently.
This is a tool that I want to use. If nobody can tell I used it, that's a good thing for me. If you don't want to use it, then nobody is making you.
I recently had some old Super 8 films shot by my parents scanned into 1080p resolution in ProRes HQ. Because of the poor optics of the original camera, imperfect focus when shooting, poor lighting conditions, and general deterioration of the film stock, most of the footage won't get anywhere near what 1080p could deliver.
What I'd like to try at some point is to let some AI/ML model process the frames, and instead of necessarily scaling it up to 4k etc., 'just' add (aka magic) missing detail into the 1080p version and generally unblur it.
Is there anything out there, even in the research phase, that can take existing video stock and hallucinate into it detail that was never there to begin with? What Nvidia is demoing here seems like a step in that direction...
I did test out Topaz Video and DaVinci's built-in super resolution feature, both of which gave me a 4k video with some changes to the original. But not the magic I am after.
I also restored some Super 8 footage recently and had great success. The biggest win I had wasn't resolution, but slowing down the speed to be correct in DaVinci, and interpolating frames to make it 60fps using the RIFE algorithm in FlowFrames. I then used Film9 to remove shake, colour-correct, sharpen and so on.
Correcting the speed and interpolating frames added an amazing amount of detail that wasn't perceptible to me in the originals (albeit it was there).
All of this processing does remove some of the charm of the medium, so I'll be keeping the original scans in any case.
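The speed correction and interpolation factor come down to simple ratios. A small worked sketch, assuming the film was shot at 18 fps (the common Super 8 rate) and the scan was written frame-for-frame into a 24 fps file; both numbers are assumptions, not something stated above.

```python
from fractions import Fraction


def retime_plan(shot_fps, file_fps, target_fps):
    """Work out the playback speed fix and the interpolation ratio."""
    shot = Fraction(shot_fps)
    file_rate = Fraction(file_fps)
    target = Fraction(target_fps)
    return {
        # Below 1 means the clip must be slowed down to play at real speed.
        "playback_speed": shot / file_rate,
        # Output frames needed per original film frame after interpolation.
        "frames_per_source_frame": target / shot,
    }


plan = retime_plan(shot_fps=18, file_fps=24, target_fps=60)
print(plan)
```

With these assumed rates the clip is slowed to 3/4 speed, and the interpolator has to produce 10 output frames for every 3 film frames, so most of the 60 fps frames are synthesized rather than scanned.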
I bought one of the cheapish (€300) Super 8/8mm scanners on Amazon. It scans quite quickly while displaying the results on a small screen.
It's a nice convenient device, but I can't now unsee the artifacting and compression arising from it. If I were to do it again I'd just pay a service to scan properly, or build a rig to photograph the frames.
On the other hand, I'm very pleased to have scanned and archived the films given that they've been unseen for so long and can now be shared easily.
An interesting thing about Super8: the resolution is generally very poor, but it can have quite the dynamic range. Also, with film in general (and video, but it's easier with film because you have global shutter) you can compensate motion blur and get more detail out which isn't visible when you look at the film frame by frame. And none of this needs AI.
Regarding hallucination, I agree with the sibling comment, the problem is that faces change. And with video, I'm not even sure the same person would have the same face in various parts of the video...
There is AI tech to do this already. It has a slight problem, though: it adds detail to faces (this is marketing speak for completely changes how people look).
The Shield is kind of an extreme outlier in today's environment. A device from 2015 that 9 years later is still one of the top tier choices in its (consumer) market is almost unheard of.
In fact, it's reportedly the Android device with the longest ongoing support out there[0]; it's crazy that mine still gets updates.
It really is awesome. I also enjoy the UI that allows you to side by side compare a stream and the difference is insane.
I have been meaning to see how well it handles streaming a desktop via Moonlight to the Shield to upscale a second monitor's content in real time. I assume it's trained for video footage and not static UI components. The RTX Windows drivers don't seem to upscale as well as the Shield.
Making up information? The same can be said for most commonly used modern compressed video formats. Just low-bitrate streams of data that get interpolated and predicted into producing what looks like high-resolution video. AV1 even has entire systems for synthesizing film grain.
The way I see it, if the AI-generated HDR looks good, why not? It wouldn't be more fake or made up than the rest of the video.
Consumer devices have never been known for color accuracy, and that goes back a very long way. The running joke in broadcast was that NTSC stood for "Never Twice the Same Color".
During the brief moment that 3DTV was popular, almost all 3DTVs had a mode that could "convert" 2D to 3D, based on movement in the scene and other pre-learned cues. "Things that look like people should be in front of things that look like scenery", and so on.
I miss 3D. I loved it, and I was sad that it didn't catch on. It enjoyed a longer life in Europe, where 3D blu-rays were produced for a few more years after they stopped selling them in the US, and I imported and enjoyed several.
Maybe Apple's VR headset will be a 3D renaissance.
The main reason at-home 3D failed is that most people don't watch at home like they do at a theater.
At a theater you sit down knowing that you can't get up and leave until it's over. At home you are doing other things: eating, folding laundry, going to the bathroom, taking phone calls, answering the door, and so on. It's not conducive to wearing glasses.
Vision will have the same problem (as does any at home headset). I don't think it will lead to a 3D renaissance, at least not for a long time, until it becomes acceptable (and feasible) to walk around with it on all the time.
Otherwise we need to wait for holographic projectors that can make a 3D image without having to wear glasses that make it hard or impossible to look at other 3D objects.
While watching a movie, look away - maybe. Get up and walk around and do chores - we always pause if needed. I think it's a matter of establishing that to watch a movie, one needs to set aside time and commit to focus just on it, but then it becomes yet another barrier.
Different story for TV shows which often are background though.
We're going to have at least one episode of those lawyer shows where they pressed enhance, and the neural network hallucinated something that wasn't there.
The work I am interested in this broader domain is conversion (say, via some NeRF) of existing standard video into spatial video e.g. MV-HEVC for immersive experience on the Vision Pro etc.
It doesn't have to be "fake" detail. An AI can use multiple frames to gather much more information than is available in a single frame and composite them into a much more detailed image.
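The simplest version of the multi-frame idea needs no AI at all: averaging several frames of the same scene suppresses noise. A toy sketch on a 1-D "scene", assuming the frames are already perfectly aligned and the noise is independent per frame (real footage needs motion estimation first, which is the hard part):

```python
import math
import random


def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def noisy_frame(scene, sigma, rng):
    """One capture of the scene with independent Gaussian noise per sample."""
    return [v + rng.gauss(0.0, sigma) for v in scene]


def temporal_average(frames):
    """Average aligned frames sample by sample."""
    n = len(frames)
    return [sum(samples) / n for samples in zip(*frames)]


rng = random.Random(0)  # fixed seed so the run is repeatable
scene = [0.5 + 0.4 * math.sin(i / 8.0) for i in range(512)]
frames = [noisy_frame(scene, sigma=0.1, rng=rng) for _ in range(16)]

single_error = rmse(frames[0], scene)
stacked_error = rmse(temporal_average(frames), scene)
print(single_error, stacked_error)
```

With N independent frames the noise drops by roughly a factor of sqrt(N), so the stacked result sits much closer to the true scene than any single frame does.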
I can pretty easily distinguish useful LLM output from non-useful LLM output though others on this website seem to have lots of trouble. I think I can pretty easily do the same for things in the visual field. To be honest, part of why I'm successful is that I can draw use out of imperfect tools.
Oh boy. Fake detail is, well, fake. It doesn't allow you to learn anything about the world beyond self-deception. But of course, if this is what floats your boat then be my guest.
> Using the power of Tensor Cores on GeForce RTX GPUs, RTX Video HDR allows gamers and creators to maximize their HDR panel’s ability to display vivid, dynamic colors, preserving intricate details that may be inadvertently lost due to video compression.
There is so much marketing BS in one small paragraph. For starters, generating(/hallucinating) data is imho the opposite of preserving anything. Then HDR is less associated with "intricate details" and more to do with color reproduction. Finally, video compression is the one thing that usually does not have problems with HDR, even the now venerable x264 can handle HDR content, generally it's almost everything else that struggles.
Of course in a true marketing tradition, none of the things are also strictly false. I'm sure there are many ways to weasel the claims.
Not the OP, but you have to understand that 'compression of the dynamic range' is an artistic tool. Literally choosing the lighting ratio of an image is how you build out lighting for a scene. With AI overwriting these choices, you're looking at something more akin to colorization than upscaling.
Not really, half the battle with SDR video is tonemapping a high dynamic range to fit into SDR. That process is not artistic, it's a process of trying not to make it look bad.
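To make "tonemapping" concrete, here is a sketch comparing hard clipping with the basic Reinhard curve `L / (1 + L)`. Reinhard is a textbook operator picked for illustration, not what any particular grading tool or Nvidia's feature uses, and the luminance values are made up (1.0 stands for SDR reference white).

```python
def clip_to_sdr(luminance):
    """Naive approach: anything above SDR white is simply cut off."""
    return min(max(luminance, 0.0), 1.0)


def reinhard(luminance):
    """Basic Reinhard operator: compresses highlights smoothly into 0..1."""
    return luminance / (1.0 + luminance)


scene_values = [0.05, 0.5, 1.0, 2.0, 8.0]  # relative scene luminance
clipped = [clip_to_sdr(v) for v in scene_values]
mapped = [reinhard(v) for v in scene_values]

for v, c, m in zip(scene_values, clipped, mapped):
    print(f"{v:5.2f} -> clip {c:.3f}  reinhard {m:.3f}")
```

Clipping makes the two brightest values indistinguishable, while the curve keeps them separate at the cost of darkening the midtones (0.5 maps to about a third). Picking that trade-off per shot is where the disagreement above about "artistic" versus "damage control" comes in.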
I'm a filmmaker... I don't know a single DOP or director who wishes they could work in HDR but is limited by finances or knowledge. Again, shaping light is the essence of cinematography. Modern DSLRs far surpass the dynamic range (although not the effective resolution) of 35mm film. And yet the image they produce isn't comparable. When it comes to image quality, bit depth is enormously more important than dynamic range. When it comes to creating an artistic image, dynamic range hasn't been a limit for many decades.
Very informative comment! Could you please tell me what you mean by "effective resolution?" Is it the resolution in px or something to do with dynamic range? (I don't know anything about filmmaking.)