by jordand 18 days ago
'AV2 decoding is roughly five times more complex than AV1 decoding. In practice, that means software running on today’s hardware will struggle to decode AV2 in real time without careful, architecture-specific optimization'

AV1 software decoding is already very intensive so AV2 decoding benchmarks are the next thing that would be really interesting (or mortifying) to see.

4 comments

> AV1 software decoding is already very intensive

I think you might be misunderestimating how incredible the dav1d AV1 decoder is. Not only does it require less total time than the reference decoder to decode the same video, but it can spread that out over far more threads. I was unable to watch 4k 60fps av1 video on my media center PC (it's from 2019, so predates hardware av1 decoding, and, well, the CPU was a little long in the tooth) until I switched to dav1d. With dav1d I am now able to watch 4k 60fps av1 using software decoding, and my machine uses 10% CPU while doing so. Really amazing piece of software.

With any luck, the dav2d 5x claim will hold true, and 10% CPU usage will scale to 50% CPU usage, meaning I'm still able to watch 4k 60fps video on my media center without a hardware upgrade. (that machine doesn't have hyperthreading, so 50% cpu is actually 50%, not 100% in a fancy suit)

dav2d author here - the 5x number is just where we currently are, it's not the theoretical limit. We're hopeful that a significant amount of the increase we observe in dav2d relative to dav1d is in math code, which should be easier to optimize using hand-written assembly or other algorithmic optimizations. If that holds true, the practical slowdown once everything has been optimized may be substantially less, possibly 2x.
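For reference, the scaling arithmetic in this subthread can be sketched as below. It assumes CPU usage grows linearly with decoder complexity, which is a simplification (threading efficiency, memory bandwidth, and clock behavior all matter). The function name is made up for illustration, and the 10% baseline is the figure reported above for dav1d at 4k 60fps.

```python
def projected_cpu_percent(av1_cpu_percent: float, complexity_factor: float) -> float:
    """Naive linear projection of CPU usage, capped at 100%.

    Assumption: decode cost scales linearly with the complexity factor.
    """
    return min(100.0, av1_cpu_percent * complexity_factor)


# 5x is the current dav2d estimate quoted above; 2x is the hoped-for figure
# after optimization. Both are taken from the thread, not measured here.
for factor in (5.0, 2.0):
    print(f"{factor:g}x -> {projected_cpu_percent(10.0, factor):.0f}% CPU")
```

Under that linear assumption, a machine already near 20% CPU on AV1 would hit the cap at 5x, so the headroom depends heavily on where the AV1 baseline sits.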
Intel's Arc dGPUs were really compelling for dedicated AV1 encode and decode, especially the small form factor of some cards. You could even fit it as a secondary card in a PC dedicated to recording and encode workflows for OBS.

Hope we get a similar option with future lineups that support AV2, especially given how popular video creation and streaming are now.

Is there a compelling reason encoding needs to be done locally?

The point of encoding is to reduce downstream bandwidth for the viewer, and upstream bandwidth for the distribution network.

The content creator only needs to upload it once.

Yes.

An uncompressed 1080p, 60fps video with 24-bit color depth would need around 3Gbps to be streamed. And even if you don't need to stream it, that would still consume a sizeable portion of the write throughput of the fastest SSDs currently available; if you go up to 4K, you'd actually exceed that by a lot (not to mention, 1tb of storage would last for about 10 minutes of video).
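A quick way to check those numbers, as a rough sketch: it assumes 24 bits per pixel with no chroma subsampling, ignores blanking and container overhead, and treats 1 TB as 10^12 bytes. The helper names are hypothetical.

```python
def raw_bitrate_bps(width: int, height: int, fps: int, bits_per_pixel: int = 24) -> int:
    """Bits per second for uncompressed video at the given frame size and rate."""
    return width * height * fps * bits_per_pixel


def minutes_per_terabyte(bitrate_bps: float) -> float:
    """How many minutes of video fit in 1 TB (10**12 bytes) at this bitrate."""
    bytes_per_second = bitrate_bps / 8
    return 1e12 / bytes_per_second / 60


for name, (w, h) in {"1080p60": (1920, 1080), "4K60": (3840, 2160)}.items():
    bps = raw_bitrate_bps(w, h, 60)
    print(f"{name}: {bps / 1e9:.2f} Gbps, {bps / 8e9:.2f} GB/s, "
          f"1 TB lasts {minutes_per_terabyte(bps):.1f} min")
```

With these assumptions 1080p60 comes out just under 3 Gbps, and the "about 10 minutes per terabyte" figure matches the 4K60 case (roughly 11 minutes) rather than 1080p60.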

Who is regularly watching uncompressed videos outside of production environments? That’s got to be a very small population.
The context was remotely encoding the video, which would require sending the uncompressed stream.
I think the context was intended to be "encoded in some fashion on the upload, just not as AV2 until after the remote end does all of the transcoded variations". I.e. upload as 2x target bitrate AV1 once and distribute as 1x target bitrate AV2 1,000 times and you'll get the same quality without having to encode AV2 locally.

I've actually done a version of this for some multi-system live AV at an event before. The main software mixer workstations at the various fields used a dumb but simple encoding they could do in hardware at a high bitrate, and the machine compositing the outgoing livestream then did AV1 software encoding for the upload to the streaming site, to minimize the bandwidth required from the venue and maximize quality on the streaming site. We've since upgraded to hardware with AV1 encode though.

The practical downside is AV2 is only providing a 30% advantage over AV1. For the streaming providers, bandwidth costs are pretty cheap compared to revamping the transcoding infrastructure, so it'd probably only make financial sense once the remote end can do the most complex and quality encoding used and the rest are all simpler.
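To put the "upload once at 2x, distribute 1,000 times at 1x" idea into numbers, here is a minimal sketch. Traffic is counted in units of "one stream at the target bitrate"; the 1,000 views and the 2x upload multiplier are the illustrative figures from the comment above, and the function name is hypothetical.

```python
def traffic_units(upload_multiplier: float, views: int,
                  delivery_multiplier: float = 1.0) -> tuple[float, float, float]:
    """Return (upload, delivery, upload share of total traffic).

    All values are in multiples of one stream at the target bitrate.
    """
    upload = upload_multiplier
    delivery = views * delivery_multiplier
    return upload, delivery, upload / (upload + delivery)


upload, delivery, share = traffic_units(2.0, 1000)
print(f"upload: {upload:g} units, delivery: {delivery:g} units")
print(f"upload share of total traffic: {share:.2%}")
```

Even with the upload at double the bitrate, it is a fraction of a percent of total traffic in this example, which is why the heavier encode can plausibly live on the remote end.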

Using raw uncompressed bitrate is a bit disingenuous. How about comparing an older, widely supported codec like H.264 as a baseline?
If you compressed it with H.264, it wouldn't make much sense to send it remotely to be encoded with a better codec.
Why not? If h.264 is the best you can do with minimal resources, you can give it 5x the final bitrate and send it to a specialized/beefy encoding system to become something better.
> Using raw uncompressed bitrate is a bit disingenuous

It is not disingenuous given the context. Gp was responding to ggp's hypothetical:

>> Is there a compelling reason encoding needs to be done locally?

If you don't encode locally as the video is created, you either need to store RAW frames, which takes an enormous amount of storage, or you use a different format and suffer quality loss by transcoding.
> you use a different format and suffer quality loss by transcoding.

Compressing to AV1/h264/265 etc is really only done for the final version, but that doesn't mean that videos are stored in RAW format during editing, where it is very common to store frames locally in Apple ProRes, Avid DNxHD, or some other compressed format that's targeted towards professional editing.

Unlike AV1 and similar formats, which offer compression ratios of 1000x and more, these formats have a compression ratio of around 10x. They are very simple, and the quality loss is low enough that it doesn't matter. They also tend to store images with 30 bits per pixel instead of the 24 bpp that's normally used for streaming.
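As a rough illustration of what those ratios mean in bitrate terms, the sketch below applies the round numbers from this comment (10x at 30 bpp for an editing format, 1000x at 24 bpp for a delivery codec) to 1080p60. Real ProRes, DNxHD, and AV1 bitrates depend on profile and content, so these are only ballpark values; the function name is made up.

```python
def bitrate_mbps(width: int, height: int, fps: int,
                 bits_per_pixel: int, compression_ratio: float) -> float:
    """Compressed bitrate in Mbps, given a raw pixel format and a compression ratio."""
    raw_bps = width * height * fps * bits_per_pixel
    return raw_bps / compression_ratio / 1e6


# Round ratios taken from the comment above, not from any codec specification.
editing = bitrate_mbps(1920, 1080, 60, bits_per_pixel=30, compression_ratio=10)
delivery = bitrate_mbps(1920, 1080, 60, bits_per_pixel=24, compression_ratio=1000)

print(f"editing format (10x, 30 bpp):   {editing:.0f} Mbps")
print(f"delivery codec (1000x, 24 bpp): {delivery:.1f} Mbps")
```

Under these assumptions the editing format lands in the hundreds of Mbps while the delivery encode is a few Mbps, about two orders of magnitude apart.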

You’re not wrong but I do think it’s worth clarifying that any professional production with a budget, even a modest one, is generally being shot with a raw codec -> edited/colored with proxies -> rendered with the original raw codec where they compress for the final cut.

ProRes and the like are used for proxies or quick and dirty productions that are mostly shooting their look in camera because of a fast turnaround time. This is usually event work on a budget or something for social media.

Well yes? The platforms only accept certain resolution/bitrates and also most of America isn't running 1gig up. They're running 5-30 mbps up. So yeah they need to encode it.
> They're running 5-30 mbps up

Do you not have 98% high speed 5G coverage?

Data caps make that hard. While everyone likes to claim unlimited data, I'm not aware of any providers that don't have a heavy data user clause where they'll deprioritize your data if you're a top ~5% data user (usually somewhere over ~1TB/month).

You also will need _some_ sort of encoding locally before uploading, even if it's minimal, which could lead to issues when encoded again (although there are codecs available to minimize this).

Just over half the world’s smartphone users do (meaning almost half don’t), and certain countries/areas have way more coverage than others. And a massive number of people have limited data per month, which means it’s also a cost concern.

Leaner delivery is not just ethical, but it also makes better business sense.

Video calls & streaming.
this

for other cases, I can just wait more for my cpu/gpu/cloud to do the job

I came to post this as well. Until widespread, inexpensive hardware catches up to a 2018 codec, AV# will remain a niche ideal.
Hardly niche. My laptop isn't new and it has hardware AV1 decoding and encoding. My 10 year old iPhone 7 can play 1080p AV1 video in software for over 200 minutes with VLC. The iPhone 7 was released in 2016, a year and a half before AV1. The dav1d decoder is mighty.

Netflix uses AV1: https://netflixtechblog.com/av1-now-powering-30-of-netflix-s...

YouTube uses AV1. It's tough to be more mainstream than that.

Right click on a YouTube video and select Stats for Nerds. If your system is capable of it, chances are it will be playing back in AV1.

Most of the YouTube videos I watch these days are AV1 encodes. Sometimes it's in VP9 and occasionally it's H.264.

Supported is different from doing it well though. You do notice the performance hit even on TVs that play back YouTube videos in AV1.

Even on 1080p videos running on AV1 on 1x, the TV system bogs down and any kind of interaction has a variable 1-3s lag. On some TVs if you do 1.25x the TV automatically "downgrades" the resolution to 480p to avoid dropping frames.

I wish there was an option to still use VP9 / H.264 on those systems (even limited to 1080p).

More reason to never use the builtin stuff in a tv. Cheap sticks can handle decoding fine.
My TV lags out even when doing nothing. So I use it as a dumb panel and let another device handle the streaming and decoding. Also has the benefit of blocking LG from loading adverts all over the UI.
Youtube artificially limits the resolution, on mine if you cast the exact same video it doesn’t impose that limit and works fine.
Yeah I could imagine the AV1 codec sticking around for a very long while, even as a fallback for AV2. There's still hundreds of millions of people out there using old/cheap devices (especially in developing countries) where that battery drain from software decoding is a big problem, so AV2 would be nonviable.
Some of the early use of VP9 and AV1 was Netflix serving video to people in developing countries. Their metered bandwidth was more of a bottleneck than the CPU playback.
Same. Mostly AV1, sometimes VP9, and rarely h264.

What's missing mostly: live streams which are h264.

Currently, and I say currently, dav1d is so fast, no worries on that side.

> AV1 software decoding is already very intensive so AV2 decoding benchmarks are the next thing that would be really interesting (or mortifying) to see.

Yes, this is going to be fun to watch.