Hacker News
by gw2 440 days ago
A question to security experts reading this thread:

What is your opinion on deploying C++ codebases with mitigations like CFI and bounds checking? Let's say I have a large C++ codebase which I am unwilling to rewrite in Rust. But I:

* Enable STL bounds checking using appropriate flags (like `-D_GLIBCXX_ASSERTIONS`).

* Enable mitigations like CFI and shadow stacks.

How much less safe is "C++ w/ mitigations" than Rust? How much of the "70% CVE" statistic is relevant to such a C++ codebase?

(I've asked this in an earlier thread and also in other forums, but I never really got a response that does not boil down to "only Rust is safe, suck it up!". It also doesn't help that every other thread about C++ is about its memory unsafety...)

5 comments

It helps; however, a cultural mindset is also required.

Back in the old Usenet flamewars, C developers used to say that coding in languages like Object Pascal, Modula-2, Ada, ... was like programming in a straitjacket, and we used to call their approach cowboy programming.

When C++ came onto the scene with its improved type system, it seemed a way we could have the best of both worlds: better safety and a UNIX/C-like ecosystem.

However this eventually changed as more and more people started to adopt C++, and thanks to its C subset, many C++ projects are actually mostly C code compiled with a C++ compiler.

So hardened runtimes help a lot, as does using static analysers like clang-tidy, VC++ /analyze, Sonar, PVS-Studio, the CLion analysers, ...

However, many of them have existed for the last 30 years; I was using Parasoft in 1999.

The biggest problem is culture: thinking that such tools are only required by those who aren't good enough to program in C or C++. Naturally, those issues only happen to others; we are good drivers.

STL bounds checking isn't bounds checking. Your code (or other libraries you use) can still have simple pointer arithmetic that goes outside bounds.

But the larger problem is that bounds checking (even ASAN) isn't as good as statically checking code, i.e. your code with bounds checking still crashes at run time, which can be a denial-of-service vector, whereas with static checking your code would never have compiled in the first place.
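C++ itself has a small-scale version of this run-time versus compile-time distinction, which may help make the point concrete. The sketch below is only an illustration; the function names are made up, and it says nothing about how any particular static checker works.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Run-time check: at() throws std::out_of_range for a bad index.
// Returns true if the access was rejected while the program was running.
bool rejected_at_runtime(const std::array<int, 3>& a, std::size_t i) {
    try {
        (void)a.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Compile-time check: the index is a template argument, so a bad index
// is a build error rather than a crash in production.
int last_element(const std::array<int, 3>& a) {
    return std::get<2>(a);       // OK
    // return std::get<3>(a);    // would not compile: index out of range
}

void demo() {
    const std::array<int, 3> a{1, 2, 3};
    assert(rejected_at_runtime(a, 3));   // caught, but only when executed
    assert(!rejected_at_runtime(a, 2));
    assert(last_element(a) == 3);
}
```

The run-time variant still needs the bad index to actually occur (and then something has to handle the failure), while the compile-time variant can never ship with it.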

Nevertheless if you don't want to rewrite the world, then using these mitigations is much better than not using them. I would also add fuzzing to the mix.

DoS is vastly better than an RCE. And safe code can still panic.

But as you mention, unfortunately enabling bounds checking in the STL wouldn't catch a lot of pointer manipulation.

It would still be better than the status quo.
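To illustrate the pointer-manipulation gap, here is a minimal sketch assuming GCC's libstdc++ built with `-D_GLIBCXX_ASSERTIONS`. The two function names are hypothetical; only the in-range case is exercised, because the out-of-range case is undefined behavior in a non-hardened build.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Goes through operator[]: a hardened standard library can check `i`
// and abort on an out-of-range index.
int read_checked(const std::vector<int>& v, std::size_t i) {
    return v[i];
}

// Goes through a raw pointer: the library never sees `i`, so no
// library-level check can fire, even in a hardened build.
int read_unchecked(const std::vector<int>& v, std::size_t i) {
    const int* p = v.data();
    return *(p + i);   // out-of-range `i` is undefined behavior
}

void demo() {
    const std::vector<int> v{1, 2, 3};
    // In range, both paths agree; they only differ on a bad index.
    assert(read_checked(v, 1) == read_unchecked(v, 1));
}
```

Legacy code that passes `pointer + length` pairs around looks like the second function, which is why the flag alone does not cover it.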

Good question. If I had to bet I'd say something like half of the 70% would be prevented. Yeah it wouldn't really help with lifetime issues or type confusion but a huge proportion of that 70% is simple out-of-bounds memory accesses.

But don't forget lots of open source code is written in C and this barely helps there.

> something like half of the 70% would be prevented

Sure, but the other half are use-after-frees and those would not be exploitable anyway because of CFI and shadow stacks.

That is a very bold claim!

For the first one, a lot of this depends on how modern your codebase is. STL bounds checks work great (and have remarkably low overhead) if the vast majority of your code is working with standard library types. Maybe all of the code that might have been a C-style array in the past is now using std::vector, std::span, or std::array, and so you've got built-in lengths. Not perfect, of course, since you can still have all sorts of spatial safety issues with custom iterator implementations or whatever, but great.

But my hunch is that the vast majority of C++ codebases aren't using std::span or std::array everywhere because there is just a lot of much older code. And there's no comparable option for handling lifetime bugs.
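As a rough illustration of why bounds checks don't reach lifetime bugs, consider the classic case of a vector growing underneath an outstanding reference. This is a sketch with a made-up helper name; it deliberately never dereferences the stale pointer and only compares addresses as integers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if appending one element moved the vector's storage,
// which leaves every earlier pointer, reference, or iterator dangling.
bool append_invalidates(std::vector<int>& v, int value) {
    const auto before = reinterpret_cast<std::uintptr_t>(v.data());
    v.push_back(value);
    const auto after = reinterpret_cast<std::uintptr_t>(v.data());
    return before != after;
}

void demo() {
    std::vector<int> v{1, 2, 3};
    while (v.size() < v.capacity()) v.push_back(0);  // fill to capacity
    // const int& first = v[0];   // index 0 is in bounds, check passes...
    assert(append_invalidates(v, 4));                // ...but storage moved
    // use(first);                // would now be a use-after-free
    assert(v[0] == 1);            // re-indexing after the append is fine
}
```

Every index involved is in range, so a hardened `operator[]` has nothing to complain about; the bug is purely about when the memory was valid.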

Tools like CFI, hardware memory tagging, or pointer authentication help, but skilled exploit writers have been defeating techniques like these for a while, so they don't give you the same "at least I know this entire class of issue is prevented" confidence as bounds checks inserted into library types.

The general industry recommendation is "if you are starting something new that has security implications, please seriously explore Rust" and "if you have a legacy C++ codebase that is too expensive to port please seriously explore these mitigation techniques and understand their limitations."

My two cents, I'm wearing my exploit writer's hat, but my current day job is SWE on legacy/"modern-ish" C++ codebases.

> Enable STL bounds checking using appropriate flags

This rarely helps. Most of the nice-to-exploit bugs were in older code, which wasn't using STL containers, or which is even just written in C. However, if enabling these flags does not hurt you, please still do so, as it does make a non-zero contribution.

> Enable mitigations like CFI and shadow stacks.

Shadow stack is meh. CFI helps a bit more; however, there are some caveats depending on which CFI implementation you are talking about, i.e. how strong it is (for example, is it typed or not?). But in the best case it still just makes the bug chain one bug longer and maybe completely kills some bugs, which isn't enough to make exploits impossible. It just raises the bar (that's important too, though). It also depends on the specific scenario. For example, for a browser renderer without a sandbox, site isolation, etc., CFI alone makes almost no impact, as in this case achieving arbitrary R/W is usually easier than taking over $rip, and it's obvious you can do a data-only attack to get UXSS, which is a serious enough threat. On the other hand, if it's a server, you are mainly dealing with remote attackers, and there's inherently no good leak primitive, etc., a soup of various mitigations could make a real difference.
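The "typed or not" distinction can be sketched with a toy model. This is not how any real CFI implementation is built; functions are represented by invented names and signature strings, purely to show which indirect-call targets each policy would accept.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of forward-edge CFI policies (all names are hypothetical).
struct ToyCfi {
    std::map<std::string, std::string> signature_of;  // function -> signature

    // Coarse-grained policy: any known function entry is an allowed target.
    bool coarse_allows(const std::string& target) const {
        return signature_of.count(target) != 0;
    }

    // Typed policy: the target must also match the signature that the
    // indirect call site expects.
    bool typed_allows(const std::string& target,
                      const std::string& expected_sig) const {
        const auto it = signature_of.find(target);
        return it != signature_of.end() && it->second == expected_sig;
    }
};

void demo() {
    ToyCfi cfi;
    cfi.signature_of["log_value"] = "void(int)";
    cfi.signature_of["parse_header"] = "int(const char*)";
    cfi.signature_of["run_command"] = "int(const char*)";

    // A corrupted pointer at a void(int) call site, redirected to run_command:
    assert(cfi.coarse_allows("run_command"));               // coarse: allowed
    assert(!cfi.typed_allows("run_command", "void(int)"));  // typed: blocked
    // Same-signature targets remain reachable even under the typed policy.
    assert(cfi.typed_allows("run_command", "int(const char*)"));
}
```

The last assertion is the "raises the bar" part: a typed policy shrinks the target set but does not empty it.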

So, all in all, it's hard to tell without your project details.

> How much of the "70% CVE" statistic is relevant to such a C++ codebase?

Uh, I'd guess, half or more of that. But still, it just raises the bar.

First of all, thanks for your response.

> This rarely helps. Most of the nice-to-exploit bugs were in older codes, which weren't using STL containers.

While I agree with this, isn't modifying that code to use STL containers much cheaper than rewriting it in an entirely new language?

> Shadow stack is meh.

Are you referring to the idea of shadow stacks in general or a particular implementation of them?

> For example, for browser renderer without sandbox / site-isolation etc

I may be wrong, but I think you are referring to JIT bugs leading to arbitrary script execution in JS engines. I don't think memory safety can do anything about it because those bugs happen in the binding layer between the C++ code and JS scripts. Binding code would have to use unsafe code anyway. (In general, script injection has nothing to do with memory safety, see Log4j)

> Uh, I'd guess, half or more of that.

I mean, if you are after RCEs, don't CFI and shadow stacks halt the program instead of letting the CPU jump to the injected code?

Now, let me get more specific - can you name one widespread C++ exploit that:

* would have happened even if the above mentioned mitigations were employed.

* would not have happened in a memory safe language?

All good questions.

> is not modifying those code to use STL containers much cheaper

That's right. However, I'd add that most exploited bugs these days (in high-profile targets) are temporal memory safety (i.e. lifetime) bugs. The remaining spatial (out-of-bounds) bugs are mostly in long-forgotten dependencies.

> Are you referring to the idea of shadow stacks in general or a particular implementation of them?

The idea. A shadow stack (assuming a perfect hardware-assisted implementation) is a good backward-edge control-flow integrity idea, and it ruins one of the common ways to take over $rip (writing a ROP chain to the stack), but that's it. Besides making exploitation harder, both forward-edge and backward-edge CFI also kill some bugs. However, IMO we are long past the days of non-linear stack buffer overflows; once in a while there may still be news about one, but it could be news precisely because it is an outlier. Hence, compared to CFI, the bugs a shadow stack kills are pretty irrelevant now.
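For readers unfamiliar with the mechanism, the backward-edge check can be modeled in a few lines. This is a toy sketch with an invented class name; real shadow stacks live in protected memory and are maintained by hardware or compiler instrumentation, not by a `std::vector`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a shadow stack.
class ToyShadowStack {
public:
    // On call: keep a protected copy of the return address.
    void on_call(std::uintptr_t return_address) {
        saved_.push_back(return_address);
    }

    // On return: compare the address found on the regular stack with the
    // protected copy. false means the return address was tampered with,
    // at which point a real implementation would halt the program.
    bool on_return(std::uintptr_t address_on_stack) {
        if (saved_.empty()) return false;
        const std::uintptr_t expected = saved_.back();
        saved_.pop_back();
        return address_on_stack == expected;
    }

private:
    std::vector<std::uintptr_t> saved_;
};

void demo() {
    ToyShadowStack ss;
    ss.on_call(0x401000);
    assert(ss.on_return(0x401000));    // untouched return address: fine

    ss.on_call(0x401000);
    assert(!ss.on_return(0xdeadbeef)); // overwritten (e.g. ROP chain): caught
}
```

Note what the model checks: only return addresses. A write that corrupts ordinary data never passes through `on_return`, so it goes unnoticed.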

> JIT bugs leading to arbitrary script execution in JS engines

Not necessarily JIT bugs. It could also be a UAF that people took down a bloody path to convert into an `ArrayBuffer` with base address = 0 and size = 0x7FFFFFFFFFFFFFFF, accessible from JavaScript. Chrome killed this specific primitive. But there are more; I'm not going to talk about them here.

You may have a slight confusion here. In the case of a browser renderer, people start with arbitrary JavaScript execution; the goal is to do, via memory corruption, what JavaScript (on this page!) can't do, including, but not limited to, executing arbitrary native code. For example, for a few years, being able to access Chrome-specific JS APIs to send arbitrary IPC messages to the browser process (out of the renderer sandbox) was one `bool` flag in .bss away from JavaScript. If we manage to get arbitrary R/W (that is, we can read and write all memory within the renderer process from JavaScript; see my ArrayBuffer example above), we just change it and run our sandbox escape against the browser process in JavaScript. Who needs native code execution?
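The "one `bool` flag away" idea can be shown with a toy model of a data-only attack. Everything here is invented for illustration (the struct, the offset, the method names); it is not Chrome's layout or API, just a flat block of memory with a flag in it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model of a process: one flat block of memory that holds both
// attacker-reachable data and a privilege flag.
struct ToyProcess {
    static constexpr std::size_t kFlagOffset = 40;  // hypothetical location
    std::array<std::uint8_t, 64> memory{};          // zero-initialized

    // Models an arbitrary-write primitive obtained from some
    // memory-corruption bug (e.g. a UAF turned into a huge buffer view).
    void arbitrary_write(std::size_t offset, std::uint8_t value) {
        memory.at(offset) = value;  // stays inside the toy memory
    }

    // The legitimate code path that consults the flag. No function pointer
    // or return address is involved, so CFI and shadow stacks have
    // nothing to check here.
    bool privileged_api_enabled() const {
        return memory[kFlagOffset] != 0;
    }
};

void demo() {
    ToyProcess p;
    assert(!p.privileged_api_enabled());
    p.arbitrary_write(ToyProcess::kFlagOffset, 1);  // flip one byte of data
    assert(p.privileged_api_enabled());             // control flow untouched
}
```

The program's control flow is identical before and after the write; only the data it branches on has changed.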

Or, if you do want native code execution: for a few years in V8, the native code that WASM gets compiled to was RWX in memory, so you just use your arbitrary R/W to write to that. You can kill that too, but then people start coming up with bizarre tricks like overwriting your WASM code cache when you load it from disk and before it is made R/X, and there are enough fish in the pool that you likely can't patch 'em all.

> I mean, if you are after RCEs, don't CFI and shadow stacks halt the program instead of letting the CPU jumping to the injected code?

Yeah. But as I said, nowadays people usually use temporal memory safety bugs, and they want arbitrary R/W before they attempt to take over $rip. Don't get me wrong, this is because of the success of CFI and similar mitigations! So they did work, they just can't stop people from popping your phones.

> can you name one widespread C++ exploit that:

I just google'd "Chrome in the wild UAF" and casually found this in the first page: https://securelist.com/the-zero-day-exploits-of-operation-wi...

I assume "in the wild exploited" fits your "widespread" requirement.

Granted, it's five years old, but if you are okay with non-ITW bugs I can come up with a lot more recent ones (in my mind).

This is a UAF, so it would not have happened in a memory-safe language. While the exploited chrome.exe may not have had CFG enabled back then (it was enabled in late 2021, IIRC), I don't see how the exploit path could be blocked by CFI.