Hacker News new | ask | show | jobs
by cogman10 14 days ago
Encoder and decoder writers frequently need extremely fine grain control over SIMD instructions in order to get good performance.

The way they weave these instructions can be very hard to express with a high level language.

Further, there's a ton of work with arrays and importantly parts of arrays. They can, for example, need to extract every other element up to 1/2 the array. Unfortunately, rust has runtime array bounds checks which make writing that sort of code slower. The compiler can elade those checks, but usually only in simple cases.

The authors would be writing a bunch of unsafe rust to get the performance they want and rust makes that more painful on purpose.

I like rust, but C/ASM really is the right choice here. This is one of the few cases where rust's safety is a major detriment.

1 comments

Performance should not be priority #1. Security should be. Why do we slow down all CPUs to prevent SPECTRE attacks yet continue to write in C? As rav1d shows, the perf loss is far less to migrate from C to Rust than it is to apply SPECTRE mitigations, and adding a sandbox around a memory-unsafe codec is going to be way more expensive again than using Rust code to start.
> Performance should not be priority #1. Security should be.

For a web browser, or a server in a bank, sure. For anything else, questionable.

> adding a sandbox around a memory-unsafe codec is going to be way more expensive

In modern world, overhead of strong sandboxes is surprisingly small. A nuclear but most reliable option is hardware assisted VM. On modern computers with SLAT and virtualized IO the overhead for most use cases is negligible. If you want something lighter weight, can use a multi-user nature of all modern OS kernels and isolate into a separate process with restricted permissions. Sandboxing overhead is approximately zero.

> As rav1d shows

rav1d is not a full rewrite of dav1d to rust. So it really doesn't show that. It's currently C + rust + asm.

I don't think we can say anything about what this does or does not prove about the performance of safe code.

> Performance should not be priority #1. Security should be.

Entirely depends on the application. The reason rust has `unsafe` is because there's some situations where performance needs to preempt potential security problems.

Codecs are difficult and expensive to develop. Therefore they get reused in many contexts, including security critical ones. Sandboxing is shown over and over to not be a great security solution, so what this means in practice is that security-critical software that needs software decoding get pwned because software engineers don't care to prioritize it in the first place.

Why shouldn't safety be the default? If you really want to, it wouldn't be too hard to maintain a patch on top of rustc to drop the bounds checks if you want to compile object files without them.

Software decoding has a safety culture problem, and we need to talk about it.

> Why shouldn't safety be the default?

Because safe code isn't fast enough to decode live video.

> If you really want to, it wouldn't be too hard to maintain a patch on top of rustc to drop the bounds checks if you want to compile object files without them.

Yeah, but then you are undermining safety in a critical way that does lead to security vulnerabilities (buffer overflow). And you are also now maintaining and requiring other devs for a project to use a custom version of rustc. That's certainly part of the reason that's simply not happened.

But another major part of it is that encoders end up with a lot of custom ASM regardless. That custom ASM is going to be where vulnerabilities end up. You don't really escape that by using rust.

If you are already abandoning where you critically need safety the most for performance, then why pick a language that additionally penalizes you for using unsafe constructs?

> Software decoding has a safety culture problem, and we need to talk about it.

Compilers and languages have an optimization problem that we need to talk about. SIMD optimizations remain a very hard thing for compilers to get right. We should talk about what it'd take to make compilers better and the reasons for why codec devs need to drop down to asm instead of using a high level compiler.

There might not be a solution to this problem, there are reasons for it.

> Because safe code isn't fast enough to decode live video.

I strongly doubt that.

And if any implementation of AV2 can be "fast enough", then there should be no question at all that we can write "fast enough" safe decoders for every other codec. Absolutely no way safe code is inherently that much slower.

Show me the AV1, H.265, or even H.264 decoder that doesn't ultimately rely heavily on hand written assembly to achieve "fast enough".

You can doubt all you like. Ultimately, there's a reason why dav1d includes hand coded SIMD for common platforms.

It's simply impossible to get a compiler to emit something like this [1].

[1] https://github.com/videolan/dav1d/blob/master/src/x86/ipred_...

What's supposed to be the big source of unsafety in codecs though? Feels like the problem here is that C developers are ruining the reputation of C with their garbage code.

Bounds checking as a source of slowdown is overrated in a niche where you're working on fixed size blocks. It feels like the C developers are getting the parts outside the ASM kernels wrong.

> What's supposed to be the big source of unsafety in codecs though?

Hand written assembly. It's quite easy to accidentally start reading or manipulating a block of memory you didn't intend to when doing complex SIMD transformations.

> Bounds checking as a source of slowdown is overrated in a niche where you're working on fixed size blocks.

I think you don't really understand how codecs work. It is not uncommon for a transformation like `a = b[c[i] * 3 + offset];`. There's no way for a compiler to omit the bounds check because it can't prove the contents of `c` aren't going to exceed the bounds of `b`.

This isn't a "crappy C developer" problem. This is a "There isn't a language that does a great job at capturing high level SIMD expressions" problem.