Hacker News new | ask | show | jobs
by nolist_policy 1008 days ago
I think sandboxing is the more powerful solution. You think in terms of "What privileges can the attacker gain if this code blows up?" and limit the code's privileges to the minimum.

Problem is, sandboxing is harder to implement so it's often done suboptimally or not at all.

3 comments

Yes and Google Chrome has invested heavily in sandboxing and still had to ship this as a high-priority fix. I'd say sandboxing in conjunction with memory-safe languages is the future.
The problem is that rewriting existing code into a memory-safe language is a huge investment -- and realistically the world depends on a lot of code built over many decades that cannot be rewritten overnight. Consider that Mozilla created Rust specifically so they could rewrite their browser in it, yet still only a small fraction of Firefox code is Rust today -- much more is still C/C++. Realistically we're going to have a lot of heavily used C/C++ code forever. The singularity will come before we can replace it all.

The nice thing about sandboxing and fuzzing can be applied to existing code.

Yes, but sandboxing and fuzzing are insufficient. As pointed out in the article Google had been fuzzing this library and it didn't find the issue. They even tweaked the fuzzing after this issue was found to specifically target the area of the vulnerability and it apparently still didn't trigger the issue.

Google Chrome also implements sandboxing and many areas. It's not feasible everywhere. So for new code / libraries we should default to a memory-safe language.

I think almost everyone agrees new greenfield projects should not choose C/C++, now that Rust has matured enough and covers essentially the same use cases.

But realistically that only solves a tiny fraction of the problem, since realistically new greenfield projects started today will likely take 10+ years to become widely used, if they do at all.

WebP was written at a time when C/C++ was still the only viable language in which to write an image compression library. Saying "things like this should be written in Rust!" just doesn't actually do anything to make software like WebP secure. Improving fuzzers and sandboxing might.

> realistically new greenfield projects started today will likely take 10+ years to become widely used, if they do at all.

I'm pretty sure that even now a lot speaks against using rust for greenfield projects precisely for that reason: few people want to integrate Rust into their build chain. You basically always have to have a compiler that's 1 release old or otherwise you cannot compile new Rust software like the very widely used time crate.

If you are a new and unproven format, do you really want to bear that hit? You could make two implementations, one in C and one in Rust, but that will mean you spread your probably quite scarce engineers over two projects.

Heh. Well, speaking as the lead engineer of a large C++ systems project that sadly started a liiiiitle bit to early to use Rust (the Cloudflare Workers runtime), I'd say our attitude is the other way around: We basically refuse to bring in any kind of C/C++ dependency unless it's something used in Chrome (implying they've vetted it). We are much more willing to bring in Rust dependencies, even though they're harder for us to invoke due to the language barrier.
You can write your implementation in Rust and expose a C ABI, and anyone using your library doesn't have to know or care that the internals use Rust. That's pretty much the whole point of Rust, otherwise you'd just use OCaml.

  You basically always have to have a compiler that's 1 release
  old or otherwise you cannot compile new Rust software like the
  very widely used time crate.
That's an interesting example as the time crate stagnated for quite a while and even the current version only requires Rust 1.67 or newer. The current Rust version is 1.72.
A rewrite of a webp decoder into Rust already exists (image-rs has a decoder for webp). I'm sure there is some polishing needed but it's nothing a company the size of Google can't afford.
Unfortunately, the readme for image-rs indicates that webp support is provided via the libwebp C library!
That's for the encoding part. The 0day we are talking about was in the webp decoder, and decoders generally are the riskier component compared to encoders. I'm not sure if Chrome ships with the webp encoder at all.
Only for encoding.
Mozilla has an interesting sandboxing strategy: They compile some of the C/C++ parts of Firefox (e.g. the ogg parser) to Wasm and then back to C. Because Wasm is memory safe, the resulting C is too.
Second problem: sandbox aren't perfect either. It's indeed useful, as part of a defense-in-depth approach, but it's far from sufficient.

Memory safety could solve the problem altogether, but then again no program is 100% memory safe, there's always some kind of primitive that uses memory-unsafe code under the hood, so it's not perfect either.

The “perfect” solution would probably be:

- use memory safe languages

- all primitives using memory unsafe stuff should get formally verified

Rust is kind of aiming at this (with things like [1] and [2]), but it's not there yet.

[1]: https://dl.acm.org/doi/pdf/10.1145/3158154 [2]: https://github.com/rust-secure-code/safety-dance

This again.

Strong type systems can give provably correct code. For trusted code (e.g. not third party code), sandboxing is a post-exploit mitigation. And such a post-exploit mitigation cannot necessarily guard against any class of bugs that (at least in some aspect) provably correct code can.

Yes, of course privilege separation as much as possible is still extremely valuable, but to say that sandboxing is a "better" solution, implying that one should not pursue provable correct code in favor of post-exploit mitigation, is a harsh liability. It's the same as the "oh, we don't need to use a type safe language, we have unit tests"-crowd, only worse.

> Strong type systems can give provably correct code.

"Sound" [1] type systems only guarantee the absence of some class of bugs as well. There are a lot of bug classes that remain exploitable. Memory safety happens to be a low-hanging fruit because many existing softwares are not even written in such languages.

[1] "Strong" type systems generally refer to the intolerance towards implicit type conversions or memory unsafety, and that alone doesn't make type systems provably safe in some sense.

I did not claim that everything about the code was proven correct, but a subset of properties expressable by the type system.

I agree that using the word “strong” was wrong. I basically meant it in the sense of “good”/“elaborate”, mistakenly ignoring that “strong” already has a very specific meaning in type systems. Thanks for correcting.

> Strong type systems can give provably correct code

^ I think this is where you seemed to claim that everything about the code was proven correct.

> Strong type systems can give provably correct code.

In theory, perhaps. In practice, the compiler / runtime will have bugs. So you still need a sandboxing layer. Best to do both.

(The sandbox will have bugs too. But if you have two layers, hopefully it's hard for attackers to get ahold of zero days for both at the same time...)

Yeah, anything can be implemented incorrectly. My point was specifically that ignoring memory safe languages in favor of post-exploit mitigation is foolish, not that the latter is useless.

But in practice, type systems can be proven to be sound, implementations of type checkers can be proven correct, and while it’s still possible to make mistakes there, that isn’t so much an issue in mathematics, simply because of how rigorous it is. This does not just include type systems, there’s even an effort to rebase the foundation of mathematics on a type system instead of set theory!

In other words, we likely agree that steel vaults can have holes. But we probably also agree that an average steel vault is better in keeping things in or out of it than a velvet curtain.

> In theory, perhaps. In practice, the compiler / runtime will have bugs.

They probably won't. The trusted kernel for these systems is tiny; a sandbox is orders of magnitude more complex with orders of magnitude more chances for bugs to creep in.

What does provably correct mean here? I think you mean that the code doesn’t have any memory corruption vulnerabilities. However, that is only one class of vulnerability, so more techniques then just relying on a memory safe language are required for secure software.
It means that the type system can prove certain properties for you. For example, in languages with dependent type systems like Agda, you can construct a sorted list type that the compiler will prove it sorted at all times, otherwise it won’t compile. Or a complex tree type that is always balanced. Or a set of only even numbers, and again it won’t compile otherwise…

(Sadly, if you go that far, it isn’t generally Turing complete anymore. Though in some cases that’s a good thing.)