by matt4077 2895 days ago
Yes, of course. Anything else would be DOA.

On a general note: there really seems to be an extremely inaccurate narrative regarding AV1 and speed taking hold. I can't understand why it isn't more easily understood that a reference implementation is about accuracy only, completely ignoring performance considerations. Not in the usual "we'll now try to make it faster" sense, but as in "this is never meant to be used in production, and its performance is in no way indicative of the performance optimised encoders will see".

As but one example: media encoding is pretty close to being "embarrassingly parallel" in principle, making the first three orders of magnitude easy wins for a straightforward GPU implementation.

4 comments

Video compression engineer here.

> I can't understand why it isn't easier understood that a reference implementation is about accuracy only, completely ignoring performance considerations.

Because the official codebase conveys another message. Have a look, there are SIMD implementations for almost all supported targets.

https://aomedia.googlesource.com/aom/+/av1-normative/aom_dsp... https://aomedia.googlesource.com/aom/+/av1-normative/aom_dsp... https://aomedia.googlesource.com/aom/+/av1-normative/aom_dsp... ....

What are these files for, if not performance? They've been maintained and kept synchronized with the reference C code during the whole project, long before the codec was frozen (and it was a huge PITA).

This doesn't look like "completely ignoring performance considerations".

> As but one example: media encoding is pretty close to being "embarrassingly parallel" in principle,

Almost all video codecs exploit some block-level encoding context, which means the way you encode one block depends on how the previous neighboring blocks were encoded. This creates a huge dependency between blocks. There are tools like slicing/tiling that allow you to break these dependencies, and thus encode in parallel, but at the cost of video quality. Making the problem "embarrassingly parallel" at this point would make the video "embarrassingly ugly".
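To make the dependency argument concrete, here is a minimal sketch in Python. It is a toy model, not how any real encoder schedules work: it assumes every block takes one time step and needs its left, top, and top-right neighbours to be finished first (a pattern chosen for illustration). The function names are hypothetical.

```python
def wavefront_schedule(rows, cols):
    """Earliest step at which each block can start.

    Assumption (illustrative only): a block depends on its left, top,
    and top-right neighbours, and every block takes exactly one step.
    """
    step = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            deps = []
            if c > 0:
                deps.append(step[r][c - 1])          # left neighbour
            if r > 0:
                deps.append(step[r - 1][c])          # top neighbour
                if c + 1 < cols:
                    deps.append(step[r - 1][c + 1])  # top-right neighbour
            step[r][c] = 1 + max(deps) if deps else 0
    return step


def summarize(rows, cols):
    """Return (steps needed, number of blocks, best-case speedup)."""
    step = wavefront_schedule(rows, cols)
    total_steps = max(max(row) for row in step) + 1
    blocks = rows * cols
    return total_steps, blocks, blocks / total_steps


if __name__ == "__main__":
    # 30x17 blocks is roughly a 1080p frame cut into 64x64 blocks
    # (an illustrative choice, not taken from the thread).
    print(summarize(17, 30))
```

Under these assumptions a 30x17 grid needs 62 steps for 510 blocks, so even with unlimited cores the best-case speedup within one frame is about 8x, nowhere near three orders of magnitude. Breaking the dependencies (tiles, slices) raises that ceiling, at the quality cost described above.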

You could encode multiple frames in parallel, but then again, being able to encode them independently means you're basically trashing all the compression context (reference frames), and your video quality goes down the tubes.

In an offline encoding scenario (Netflix, YouTube), if you have lots of memory, you can encode multiple independent video sequences from the same movie. Making the problem "embarrassingly parallel" in this case would require an "embarrassingly huge" amount of memory. Also, it's not applicable to a live scenario (think: latency).

> media encoding is pretty close to being "embarrassingly parallel" in principle

My understanding is that there are some fairly tight feedback loops in the encoders that make it difficult to offload things to the GPU, at least if you want to maximize the quality per byte metric. If you want to target realtime and don't need optimal compression it probably gets easier.

> As but one example: media encoding is pretty close to being "embarrassingly parallel" in principle

Which part? 90% of what you're doing is context or inter-frame dependent. Video encoders that live on graphics cards today use dedicated ASIC hardware.

You can divide the video into chunks and encode the chunks in parallel. This is what Netflix does:

https://medium.com/netflix-techblog/high-quality-video-encod...

https://medium.com/netflix-techblog/dynamic-optimizer-a-perc...

Works well when you're doing video at the scale of Netflix, but not necessarily much help to the individual user who just wants to encode a video.

> You can divide the video into chunks and encode the chunks in parallel

You can do this with zlib too (zlib divides a file up into 64k chunks). Doesn't mean that zlib is well-suited for GPUs, nor is each chunk "embarrassingly parallel". Neither Netflix post talks about using the GPU at all.
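A rough sketch of the chunking idea, using Python's standard `zlib` module as a stand-in for a video encoder. The input is split by hand and each chunk is compressed independently in a thread pool. The helper names, the 4 KiB chunk size, and the synthetic test data are all assumptions made for illustration; this is not a description of zlib's internals or of Netflix's pipeline.

```python
import random
import zlib
from concurrent.futures import ThreadPoolExecutor


def split(data, size):
    """Cut data into consecutive chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_chunks(data, size=4096, workers=4):
    """Compress every chunk independently, in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, split(data, size)))


def decompress_chunks(blobs):
    """Inverse of compress_chunks: decompress and concatenate."""
    return b"".join(zlib.decompress(b) for b in blobs)


def make_sample(repeats=16, block_size=16384, seed=0):
    """Hypothetical test input: one random block repeated many times.

    The redundancy only shows up across chunk boundaries, loosely like
    the redundancy between video frames.
    """
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(block_size))
    return block * repeats


if __name__ == "__main__":
    data = make_sample()
    blobs = compress_chunks(data)
    assert decompress_chunks(blobs) == data
    print("one stream:", len(zlib.compress(data)))
    print("independent chunks:", sum(len(b) for b in blobs))
```

On this input the independently compressed chunks come out much larger in total than a single compressed stream, because each chunk starts with no history. That is the same trade-off as cutting a video into independently encoded pieces: the parallelism is easy, but the shared context is lost.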

> You can divide the video into chunks and encode the chunks in parallel.

What about live encoding?

People are pragmatic, at least in this regard. They don't really suffer from the bandwidth costs; they want fast encode speeds for offline storage.

And they are simply cautious. They don't really care about the hype: x264 is good enough visually, and all visual comparisons are now done at ridiculously low bitrates (which is a good thing, but people don't really care).