by profpandit 3304 days ago
It's interesting to note that the architecture of the first ISO codec, MPEG-1, is almost identical to the one we have today, H.265. That codec was standardised in the early 90s, so this design has carried through for over 20 years. Most of the changes relate to the targeted parameters such as frame size, frame rate and bitrate. Only the last step, H.264 --> H.265, seems to have added new features.

This is a very well-written introduction.

4 comments

What? They've changed the transforms, the entropy coding, the way motion compensation works, and the blocking strategy... what aspect is unchanged, actually?
These are all refinements. The broad strokes of the algo haven't changed.
What could have changed?
That's a good research topic. Considering the time frame involved (20 years), you could infer that attempts at finding alternative strategies have been half-hearted. The way ISO standardisation works is the reason for this: 99 out of 100 researchers work on the mainstream.
Just curious, as I never bothered to think about this before: in all the H.* codecs/standards, what does the "H" stand for?
It's an ITU spec naming convention - think things like X.25. They're all <letter>.<digit>+. The letters aren't often very mnemonic. The letter is a large bucket classification, audiovisual and multimedia systems for H.

https://en.wikipedia.org/wiki/ITU-T

http://www.itu.int/en/ITU-T/publications/Pages/structure.asp...
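
For what it's worth, the "<letter>.<digit>+" shape described above can be written down as a tiny pattern. This is only a sketch of that description: the names `ITU_T_NAME` and `split_recommendation` are made up for illustration, and the pattern deliberately ignores any designation that doesn't fit the simple letter-dot-digits form.

```python
import re

# Assumption: only the plain "<letter>.<digit>+" shape from the comment above;
# longer or suffixed designations are treated as non-matching.
ITU_T_NAME = re.compile(r"[A-Z]\.\d+")

def split_recommendation(name):
    """Return (series_letter, number), or None if the name has another shape."""
    if ITU_T_NAME.fullmatch(name) is None:
        return None
    series, number = name.split(".")
    return series, int(number)

# "MPEG-1" is an ISO-side name, so it should not match the ITU-style pattern.
examples = {name: split_recommendation(name)
            for name in ["H.264", "H.265", "X.25", "MPEG-1"]}
```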

Don't know, but it was associated with the ITU as opposed to the ISO before they decided to merge their efforts.
It's not as if alternatives haven't been tried. Wavelet compression never took off. I don't know whether that is because the mainstream format is just better, or because there was never enough investment in those formats.
The problem with wavelet-based compression was that the wavelet transforms were applied globally, to the whole frame at a time. While they were suitable for still-image compression, they couldn't really take advantage of motion compensation, so their applicability to video was low. The same goes for fractal-based techniques. Besides, as resolutions got higher and higher, the blockiness of the 8x8 DCT became less and less of a factor.
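
A toy 1-D sketch of the locality point above, not taken from any real codec. It assumes the simplest possible wavelet (Haar, averaged and differenced pairwise) run over the whole row, against an 8-sample DCT applied block by block; all function names are invented for the illustration. Nudging one sample only disturbs coefficients inside its own DCT block, whereas the whole-row Haar transform also disturbs the coarsest coefficients, which span the entire row.

```python
import math

def dct_1d(block):
    """Orthonormal DCT-II of a short list of samples."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out

def block_dct(signal, block_size=8):
    """Transform each block independently, as block-based codecs do."""
    coeffs = []
    for start in range(0, len(signal), block_size):
        coeffs.extend(dct_1d(signal[start:start + block_size]))
    return coeffs

def haar_full(signal):
    """Multi-level Haar transform over the whole signal (power-of-two length).

    Output layout: [overall average, coarsest detail, ..., finest details].
    """
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details = [(a - b) / 2.0 for a, b in pairs] + details
        approx = [(a + b) / 2.0 for a, b in pairs]
    return approx + details

def changed_positions(before, after, eps=1e-9):
    """Indices of coefficients that differ between two transforms."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if abs(a - b) > eps]

row = [float(v) for v in range(16)]   # two 8-sample blocks
edited = list(row)
edited[3] += 10.0                     # disturb one sample in the first block

dct_changed = changed_positions(block_dct(row), block_dct(edited))
haar_changed = changed_positions(haar_full(row), haar_full(edited))

print("block DCT coefficients touched:", dct_changed)
print("whole-row Haar coefficients touched:", haar_changed)
```

The Haar transform touches fewer coefficients in total, but two of them (positions 0 and 1) depend on every sample in the row, so the two halves can no longer be coded independently.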
Digital cinema uses wavelet compression: intra-frame-only JPEG2000 at hundreds of Mbps. It seems that at high resolution and bitrate it actually performs similarly to or better than H.264, e.g. this paper and its references: http://alumni.media.mit.edu/~shiboxin/files/Shi_ICME08.pdf
Digital cinema uses a resolution that is much higher than H.265's targeted sweet spot. Their quality needs are also a lot higher. Motion-compensated video cannot give them the desired quality, hence intra-frame-only wavelet-based compression. Also note that JPEG2000, which uses wavelets, suggests that wavelets can be made to work better for still images. JPEG, which preceded JPEG2000, was based on the 8x8 DCT.
You can still use wavelet compression to encode the residual from MC, but I think the biggest problem is performance: DCTs have been optimised far more than wavelet transforms.

Even in still-image compression, the difference is noticeable --- I have some high-resolution PDFs containing JPEG2000 scanned images, and they take significantly longer to render than the equivalent containing JPEG images.

Well, you could. But the way current schemes are structured, the motion compensation is done on a 16x16 macroblock basis. Using an 8x8 DCT to clean up the 4 quadrants within a macroblock makes sense. But performing a global wavelet transform on a motion-compensated difference image would mess you up at the boundaries of the macroblocks, since you would potentially have discontinuities there. Of course, it would be possible to devise a scheme that had a different approach to motion compensation, say a per-pixel one that used optical flow. Also, not all macroblocks are encoded using motion compensation. Even in P and B frames, some are encoded intra. You would lose that small optimisation in a wavelet-based scheme.
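
To make the macroblock-boundary point concrete, here is a minimal full-search block-matching sketch. It is a simplification under stated assumptions, not how any particular encoder does it: integer-pixel vectors only, a sum-of-absolute-differences cost, frame dimensions that are multiples of the block size, and a 4x4 block in the demo (rather than 16x16) to keep the frames tiny. The names `sad` and `motion_compensate` are made up for the example.

```python
def sad(ref, cur, ry, rx, cy, cx, size):
    """Sum of absolute differences between a reference block and a current block."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))

def motion_compensate(ref, cur, size=16, search=4):
    """Full-search block matching; returns (motion_vectors, residual)."""
    h, w = len(cur), len(cur[0])
    vectors = {}
    residual = [[0] * w for _ in range(h)]
    for by in range(0, h, size):
        for bx in range(0, w, size):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    ry, rx = by + dy, bx + dx
                    # Only consider candidates fully inside the reference frame.
                    if 0 <= ry <= h - size and 0 <= rx <= w - size:
                        cost = sad(ref, cur, ry, rx, by, bx, size)
                        if best is None or cost < best[0]:
                            best = (cost, dy, dx)
            _, dy, dx = best  # (0, 0) is always a valid candidate
            vectors[(by, bx)] = (dy, dx)
            # Each block's residual is formed with that block's own vector.
            for j in range(size):
                for i in range(size):
                    residual[by + j][bx + i] = (cur[by + j][bx + i]
                                                - ref[by + dy + j][bx + dx + i])
    return vectors, residual

# Demo: an 8x8 ramp shifted right by one pixel, left edge replicated.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

vectors, residual = motion_compensate(ref, cur, size=4, search=2)
print(vectors)
print(residual[0])
```

In this demo the right-hand blocks find an exact match one pixel to the left and get an all-zero residual, while the left-hand blocks cannot, so the residual jumps at the block boundary. That kind of edge is what a transform spanning block boundaries would have to encode.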

Techniques similar to the ones used to optimise DCTs could be used for wavelets. There has just not been the demand. The standardisation effort tends to swamp all alternatives once the decision to choose the algorithm has been made. There's a huge amount of momentum behind the standard, which makes alternatives very hard to pitch. This is part of the reason why the same basic technique has been in prevalent use for 20-odd years and nothing else has come into play.

Patent issues also didn't help JPEG2000.
Where can one read about all the also-ran or proprietary codecs used in the 90s?

It truly was the dark ages of digital video with low frame rates and postage stamp sized windows.

Try the FAQ for comp.compression