Hacker News new | ask | show | jobs
by jheriko 3465 days ago
it is true, video is a nightmare mess littered with weird functionality nobody needs. (limited range only just disappeared in rec 2100, optionally??? really??? i'm not worried about my electron gun in my CRT from 1975 these days...nor do i want to know what a Y or a Cb or a Cr means because everything is RGB and B&W TV is long dead... and 4:2:2 is not exactly compression so much as computational overhead etc.. etc.)

its a nightmare, but the reason for these observations is precisely that it shouldn't be a nightmare. this area of programming is a wasteland ... nobody that good wants to solve these trivial problems :/

2 comments

Chroma subsampling isn't going anywhere. You'll usually get subjectively better quality with 4:2:0 chroma compared to 4:4:4 at the same bitrate. And this means you can't have everything in RGB, so all the colorspace conversion complexity can't be ignored.

Try experimenting with chroma subsampling in JPGs, but note that not all image viewers have good chroma upscaling. MPV can display still images as well as video and you can choose the chroma scaling algorithm.

> Chroma subsampling isn't going anywhere. You'll usually get subjectively better quality with 4:2:0 chroma compared to 4:4:4 at the same bitrate. And this means you can't have everything in RGB, so all the colorspace conversion complexity can't be ignored.

What's more, YCbCr is more efficiently compressed than RGB even if you don't subsample, for the same reason that a DCT saves bits even if you don't quantize: Linearly dependent or redundant information is moved into fewer components, in this case most of the information moves into the Y channel with the Cb and Cr both being very flat in comparison. (Just look at a typical YCbCr image reinterpreted as grayscale to see what I meant)

isn't it the case that amount of data required to store the result of a lossless DCT is bounded below by the size of the data, and this is why lossless JPG compression does not use such a scheme?
I'm not actually sure. In retrospect, I'm not sure what ‘DCT without quantizing’ really means, since the output of the cosines are probably real numbers? I guess the interpretation would be quantized to however many steps to reproduce the original result when inverted (and rounded).

In lossless JPEG it seems they omitted the DCT primarily for this reason: It not being a lossless operation to begin with, if you actually want to store the result. What other lossless codecs often do is store a lossy version such as that produced by a DCT, alongside a compressed residual stream coding the difference (error).

In either case, it's important to note the distinction between reordering and compressing; reordering tricks like DCT can reorder entropy without affecting the number of bits required to store them, but the simple fact of having reordered data can make the resultant stream much easier to predict.

For example, compare an input signal like this one:

FF 00 FF 01 FF 02 FF 03 FF 04 ...

By applying a reordering transformation to move all of the low and high bytes together, you turn it into

FF FF FF FF FF .. 00 01 02 03 04 ..

which is much more easily compressed. As for whether that's the case for (some suitable definition of) lossless DCT, I'm not sure.

you are right, there is a weird grainy sharpness kind of 'feeling' when chroma is not subsampled...

but you get the exact same effect from higher resolutions, e.g. going from SD->HD->2K->4K we see the same thing... and we are still doing it, so i would question highly that it is subjectively better in a long-term sense given this continuing trend.

i remember hearing people discuss this sort of thing when HD was new, and they stopped after while - i suspect because they got used to it, and they now realise how low the quality of the SD image was. i noticed this in myself as well...

edit: incidentally there is a discussion about this here (first google thing i found): http://www.neogaf.com/forum/showthread.php?t=1308591

its seems either nobody or very few are taking the perspective that 4:2:0/4:2:2 looks better, and there are even a few descriptions of precisely what they notice as being worse.

I believe you are misunderstanding the issue.

Nobody is trying to argue that 4:2:0 video looks objectively superior to 4:4:4 video if given a free choice. Obviously, full chroma information will always be better, such as is the case for something like a PC monitor vs a TV with subsampling.

The problem is that 4:4:4 chroma requires more bits to compress, so when you're designing a video/image codec, you have to ask yourself whether the difference in bitrate between 4:2:0 and 4:4:4 is worth the difference in quality, and the answer seems to be “no”.

This means that when you're serving, say, a 5 Mbps youtube video where the bitrate is already fixed, 4:2:0 is going to give you more bits to put into useful stuff (e.g. luma plane) instead of having to waste them on mostly-redundant chroma information.

Please let me know what you do when you have filter overshoots or undershoots in full range. Limited range is there for good reason.
er... what? can you explain that a bit i think i must misunderstand?

what i think of as undershooting or overshooting is relative to the range... and besides that, what is wrong with clamping? its how computer graphics has always had to deal with these things... limited range simply doesn't exist in that context, and it doesn't harm anything.

when computer games are forced into limited range for consoles you don't get these unless your tv is applying one of those god awful filters that ruins everything anyway... (i'm still not sure why so many tvs have these - reference monitors never do anything this insane) ... but i can tell you what you do get, a subjectively /and/ measurably worse quality of image than from a monitor.

(i don't think i'm alone in this based on the contents of the ITU-R BT.2100 either... which defines a full range as well as a 'narrow' one)

> what i think of as undershooting or overshooting is relative to the range... and besides that, what is wrong with clamping? its how computer graphics has always had to deal with these things... limited range simply doesn't exist in that context, and it doesn't harm anything.

As far as I understand it, limited range was historically used so you could use efficient fixed-function integer math for your processing filters without needing to worry about overflow or underflow inside the processing chain. You can't just “clamp back” a signal after an overflow happens.

Of course, it's pretty much irrelevant in 2016 when floating point processing is the norm and TVs come with their own operating systems, so these days it just exists for backwards compatibility with the existing stuff - which is a property that video standards have tried to preserve as much as possible since the early beginnings of television.