Hacker News
by p0nce 4095 days ago
In most papers I've seen VP9 fared worse than HEVC: http://infoscience.epfl.ch/record/200925/files/article-vp9-s...

See also http://iphome.hhi.de/marpe/download/Performance_HEVC_VP9_X26...

Both of those used much older encoders, of course (nearly 2 years old in the case of the PCS paper), and some of the settings they used were questionable. For example, what HEVC calls "constant QP" actually varies the quantizer based on the frame type, while VP9 really uses the same quantizer, which can make a big difference on metrics. When I talked to the authors at PCS, they said they had later re-run the results with more relaxed QP settings, and VP9 got somewhat better, but still didn't catch up with HEVC (on metrics).
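
To make the "constant QP" difference concrete, here is a minimal Python sketch. The per-frame-type offsets are hypothetical values picked for illustration, not the ones used by any real HEVC or VP9 encoder; the only point is that one scheme varies the quantizer with frame type while the other applies the same value to every frame.

```python
# Hypothetical per-frame-type QP offsets; illustrative only, not taken from any encoder.
HYPOTHETICAL_OFFSETS = {"I": 0, "P": 1, "B": 2}

def frame_type_qp(base_qp, frame_types, offsets=HYPOTHETICAL_OFFSETS):
    """'Constant QP' in the sense described above: the QP still varies with frame type."""
    return [base_qp + offsets[t] for t in frame_types]

def fixed_qp(base_qp, frame_types):
    """A true fixed quantizer: every frame gets the same value."""
    return [base_qp for _ in frame_types]

gop = ["I", "B", "B", "P", "B", "B", "P"]
print(frame_type_qp(32, gop))  # [32, 34, 34, 33, 34, 34, 33]
print(fixed_qp(32, gop))       # [32, 32, 32, 32, 32, 32, 32]
```

Feeding the two schemes into a metric comparison is not apples to apples, which is presumably part of why the results moved once the settings were changed.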

Keep in mind all of these results are from people who have spent significant time working in MPEG/ITU/JVT/JCTVC, and may have some inherent biases. Google's own results looked much better: http://ieeexplore.ieee.org/iel7/6729645/6737661/06737765.pdf... (sorry for the paywall, I'm not aware of a free version available online; tl;dr: 30.38% better rate than H.264, 2.49% worse than HEVC), but obviously come with their own set of biases.

I don't know how to explain the discrepancies in the two sets of results, but they at least demonstrate the magnitude of the differences you can obtain by varying how you do the testing.

That iphome paper used a random git checkout from the day the bitstream was frozen, and claimed it was an official release, which always struck me as somewhere on the stupid/devious spectrum.

The first actual VP9 release was about 6 months later, and even then I don't think they'd done much tuning for speed. Some parts of Google get criticised for building open source products privately, but others that develop in the open have their openness abused.

I ran some experiments to investigate the discrepancy.

The largest differences were caused by:

1. A variable quantiser being used for HEVC, but not for VP9 (as you described)

2. Keyframes being forced every 30 frames for VP9 in the first paper

HEVC also had I frames added every 30 frames, but these were not IDR frames, meaning that B frames were allowed to use the information from the I frame in HEVC.

However, in VP9, true keyframes were forced every 30 frames. Because of the way VP9 works, this meant that every 30 frames it encoded both a new intra frame and a new golden frame.

Making both codecs use a true fixed quantizer and removing the forced intra frames made the results more like Google's own paper.

I guess the moral is to not force frequent keyframes when encoding with VP9.
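
A back-of-the-envelope way to see why forced keyframes hurt: the Python sketch below estimates the average bits per frame for a given keyframe interval. The frame sizes are made-up numbers for illustration (a keyframe assumed to cost 10x an inter frame), not measurements from either paper.

```python
def average_bits_per_frame(key_bits, inter_bits, interval):
    """Average frame size when one keyframe is forced every `interval` frames."""
    return (key_bits + (interval - 1) * inter_bits) / interval

# Hypothetical sizes: keyframe = 100 kbit, inter frame = 10 kbit.
every_30 = average_bits_per_frame(100_000, 10_000, 30)
every_300 = average_bits_per_frame(100_000, 10_000, 300)

print(every_30, every_300)
print(f"overhead of the shorter interval: {every_30 / every_300 - 1:.1%}")
```

This simple model ignores the extra golden frame mentioned above, which would add to the cost of each forced keyframe in VP9.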

At some point comparing highly efficient codecs with metrics like PSNR becomes meaningless, I'm not sure why so much emphasis is still put on it. Verification against human preference is the way forward.
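
For reference, PSNR is just a log-scaled mean squared error, so it has no notion of what a viewer actually notices. A minimal Python sketch of the standard definition, assuming 8-bit samples passed as flat sequences (the function name and input layout are my own choices for illustration):

```python
import math

def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)

print(psnr([10, 20, 30, 40], [11, 19, 31, 39]))
```

Two encodes with the same squared error get the same score regardless of where the error lands in the picture.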

I believe methods like A/B preference testing have been done for audio codecs for a few decades, idk why video didn't catch up.

In this respect I look forward to the work being done by the Daala guys, who come from the highly successful Opus audio codec and seem to be very mindful of perceptual optimization.

>At some point comparing highly efficient codecs with metrics like PSNR becomes meaningless, I'm not sure why so much emphasis is still put on it.

Because (some) codec developers have put a lot of effort into optimising for it, and simple numbers are easier to market to people who haven't actually compared the codecs themselves. You can see how successful it is from the many posts in this thread that assume VP9 gives you higher quality at the same file size than H.264.

On2 codecs were known for relatively heavy blurring that improved PSNR stats but looked subjectively worse. Most comparisons I've seen indicate this is still the case with VP9. Of course it does depend a lot on the source material - some things make it more noticeable than others.

Thanks. Even with bias, Google's results look way more realistic to me than "100x slower and worse than H.264".