by tptacek 277 days ago
> Both approaches revealed the same conclusion: Memory Integrity Enforcement vastly reduces the exploitation strategies available to attackers. Though memory corruption bugs are usually interchangeable, MIE cut off so many exploit steps at a fundamental level that it was not possible to restore the chains by swapping in new bugs. Even with substantial effort, we could not rebuild any of these chains to work around MIE. The few memory corruption effects that remained are unreliable and don’t give attackers sufficient momentum to successfully exploit these bugs.

This is great, and a bit of a buried lede. Some of the economics of mercenary spyware depend on chains with interchangeable parts, and countermeasures targeting that property directly are interesting.


In terms of Apple Kremlinology, should this be seen as a step towards full capability-based memory safety like CHERI ( https://en.wikipedia.org/wiki/Capability_Hardware_Enhanced_R... ) or more as Apple signaling that it thinks it can get by without something like CHERI?
IMO it's the latter; CHERI requires a lot of heavy lifting at the compile-and-link layer that restricts application code behaviors, and an enormous change to the microarchitecture. On the other hand, heap-cookies / tag secrets can be delegated to the allocator at runtime in something like MIE / MTE, and existing component-level building blocks like the SPTM can provide some of the guarantees without needing a whole parallel memory architecture for capabilities like CHERI demands.
> CHERI requires a lot of heavy lifting at the compile-and-link layer that restricts application code behaviors, and an enormous change to the microarchitecture.

Well, Apple already routinely forces developers to recompile their applications, so if Apple wants to introduce something that needs a compiler / toolchain update, they can do that easily. They also control the entire SoC from start to finish and, unlike pretty much everyone else, hold an ARM Architecture License, so they can change whatever they want on the hardware side as well.

To reiterate what I've said elsewhere, CHERI does not need a whole parallel memory architecture, there is just one that gets a slight extension over a non-CHERI/MTE system to include tags. But that is the same story as MTE, which also needs to propagate the tags in the memory system (and in fact, more tags, since we just need one bit per 16 bytes, whereas MTE needs 4 bits per 16 bytes in the common scheme).
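To put numbers on the tag comparison above, here is a minimal sketch of the arithmetic. It assumes only the figures given in the comment (1 tag bit per 16 bytes for CHERI, 4 tag bits per 16 bytes for MTE in the common scheme); the function names and the 8 GiB example are illustrative, not taken from any spec.

```python
from fractions import Fraction

GRANULE_BYTES = 16  # both schemes tag memory in 16-byte granules (per the comment above)


def tag_overhead(tag_bits_per_granule: int) -> Fraction:
    """Fraction of extra storage needed for tags, relative to the data they cover."""
    return Fraction(tag_bits_per_granule, GRANULE_BYTES * 8)


def tag_bytes(memory_bytes: int, tag_bits_per_granule: int) -> int:
    """Bytes of tag storage needed to cover memory_bytes of data."""
    return int(memory_bytes * tag_overhead(tag_bits_per_granule))


cheri = tag_overhead(1)  # 1 validity bit per 16 bytes
mte = tag_overhead(4)    # 4-bit allocation tag per 16 bytes

# Hypothetical 8 GiB device, chosen only to make the ratio concrete.
ram = 8 * 2**30
print(f"CHERI tag overhead: {cheri} of memory, {tag_bytes(ram, 1) // 2**20} MiB for 8 GiB")
print(f"MTE tag overhead:   {mte} of memory, {tag_bytes(ram, 4) // 2**20} MiB for 8 GiB")
```

Under these assumptions MTE carries four times as much tag state as CHERI, which is the point being made about the memory system.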
> compile-and-link layer

Not to mention the dynamic linker.

Yeah you need a compiler, linker and OS. That's true of any security technology. CHERI may be more significant in that regard because it's a bigger rethink than just stuffing some extra metadata into the existing types, but it's not at all intractable. We, a research group, maintain CheriBSD, a "full-fat" port of FreeBSD to CHERI (Morello and CHERI-RISC-V), so to a big tech organisation it's a small investment. The cost to tech companies is not making it work, it's often much more boring business factors.
Homepage here:

  https://www.cheribsd.org/
which strangely doesn’t seem to link here:

  https://github.com/CTSRD-CHERI/cheribsd
MTE and CHERI are so different that it’s hard and maybe not even possible to do both at the same time (you might not have enough spare bits in a CHERI 128 bit ptr for the MTE tag)

They also imply a very different system architecture.

We actually have ideas for how to combine the two; see section C.5 of https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-987.pdf
Sure, I'm not suggesting that Apple might actually do both at the same time. They could however implement the less burdensome one now while intending to replace it with the all-singing-all-dancing alternative down the line.
Gotcha. My point about different systems architectures makes me think it’s unlikely that you’d want to do that
> MTE and CHERI are so different that it’s hard and maybe not even possible to do both at the same time (you might not have enough spare bits in a CHERI 128 bit ptr for the MTE tag)

Why would you need MTE if you have CHERI?

Why would you need CHERI if you have working mitigations that don't demand a second bus?

I think it's two sides of the same coin and Apple chose the second side.

The two systems are largely orthogonal; I think if Apple chooses to go from one to the other, it will be a generational change rather than an incremental one. The advantage of MTE/MIE is you can do it incrementally by just changing the high bits the allocator supplies; CHERI requires a fundamental paradigm shift. Apple love paradigm shifts but there's no indication they're going to do one here; if they do, it will be a separate effort.
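As a rough illustration of "changing the high bits the allocator supplies", here is a toy model of allocator-driven tagging. It is a sketch, not how any real allocator or MIE works: the class and method names are hypothetical, the bump allocator is deliberately naive, and retagging on free is an assumption about allocator policy. The one detail taken from the architecture is that MTE keeps its 4-bit logical tag in pointer bits 59:56 and tags memory in 16-byte granules.

```python
import random

TAG_SHIFT = 56   # MTE keeps the 4-bit logical tag in pointer bits 59:56
TAG_MASK = 0xF
GRANULE = 16     # one memory tag per 16-byte granule
ADDR_MASK = (1 << TAG_SHIFT) - 1


class TaggedHeap:
    """Toy model with hypothetical names; no bounds or exhaustion handling."""

    def __init__(self, size: int, seed: int = 0):
        self.mem = bytearray(size)
        self.tags = [0] * (size // GRANULE)  # tag 0 means "never allocated" here
        self.rng = random.Random(seed)
        self.next = 0

    def _retag(self, addr: int, size: int, exclude: int) -> int:
        # Pick a fresh non-zero tag that differs from the excluded one.
        tag = self.rng.choice([t for t in range(1, 16) if t != exclude])
        first = addr // GRANULE
        last = (addr + size + GRANULE - 1) // GRANULE
        for g in range(first, last):
            self.tags[g] = tag
        return tag

    def malloc(self, size: int) -> int:
        size = -(-size // GRANULE) * GRANULE  # round up to whole granules
        addr = self.next
        self.next += size
        tag = self._retag(addr, size, exclude=self.tags[addr // GRANULE])
        return (tag << TAG_SHIFT) | addr      # the tag rides in the pointer's high bits

    def free(self, ptr: int, size: int) -> None:
        # Assumed policy: retag on free so stale pointers stop matching.
        self._retag(ptr & ADDR_MASK, size, exclude=(ptr >> TAG_SHIFT) & TAG_MASK)

    def load(self, ptr: int) -> int:
        addr = ptr & ADDR_MASK
        if ((ptr >> TAG_SHIFT) & TAG_MASK) != self.tags[addr // GRANULE]:
            raise MemoryError("tag check fault")
        return self.mem[addr]


heap = TaggedHeap(256, seed=1)
p = heap.malloc(32)
heap.load(p)          # in bounds, tags match
heap.free(p, 32)      # memory is retagged; p is now stale
try:
    heap.load(p)
except MemoryError as e:
    print("use after free caught:", e)
```

Nothing outside the allocator and the load path changes in this model, which is what makes the approach incremental: application code keeps using ordinary-looking pointers.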

CHERI is deterministic.

That’s strictly better, in theory.

(Not sure it’s practically better. You could make an argument that it’s not.)

FWIW (I am a nobody compared to you; I didn't make FIL-C :) ) - I think that MIE/MTE are practically superior to CHERI.

I also think this argument is compelling because one exists in millions of consumer devices, with more to come (MTE -> MIE), and one does not.

This is on the verge of pedantry, but CHERI determinism isn't strictly true: garbage collecting abandoned descriptors is currently done asynchronously. Malicious code could attempt to reuse an abandoned descriptor before it is "disappeared". I think it might be possible to construct a synthetic situation where two threads operating with perhaps different privilege in the same address space (something CHERI can support!) and sharing an IPC channel might be affected by the timing.

There is a section in the technical reports that talks about garbage collection.

I don't think CHERI is currently being used with different privileged threads in the same address space.

Second bus?
CHERI fundamentally relies on capabilities living in memory that is architecturally separate from program memory. You could do so using a bus firewall, but then you're at the same place as MIE with the SPTM.
Not saying you’d want both. Just answering why MTE isn’t a path to CHERI

But here’s a reason to do both: CHERI’s UAF story isn’t great. Adding MTE means you get a probabilistic story at least
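To make "probabilistic" concrete, here is a back-of-the-envelope sketch. It assumes a 4-bit tag and that freed memory is given a new tag drawn uniformly and independently at each reuse; real allocators may reserve or exclude tags, which the `reserved_tags` parameter (a hypothetical knob) stands in for.

```python
from fractions import Fraction


def stale_pointer_hit_chance(tag_bits: int = 4, reserved_tags: int = 0) -> Fraction:
    """Chance that a stale pointer's tag happens to match the reused memory's new tag."""
    usable = 2**tag_bits - reserved_tags
    return Fraction(1, usable)


def chain_survival(steps: int, tag_bits: int = 4) -> Fraction:
    """Chance that an exploit needing `steps` independent lucky tag matches survives them all."""
    return stale_pointer_hit_chance(tag_bits) ** steps


print("single use-after-free goes undetected:", stale_pointer_hit_chance())
print("three independent matches in a row:   ", chain_survival(3))
```

Under these assumptions a single dangling access slips through 1 time in 16, and the odds shrink geometrically when an exploit needs several such accesses.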

True! On the flip side, MTE sucks at intra-object corruption: if I get access to a heap object with pointers, MTE doesn't affect me, I can go ahead and write to that object because I own the tag.

Overall my _personal_ opinion is that CHERI is a huge win at a huge cost, while MTE is a huge win at a low cost. But, there are definitely vulnerability classes that each system excels at.
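The intra-object point can be sketched with a toy tag check. The layout below is hypothetical (a 16-byte buffer followed by a pointer field inside one 32-byte object, next to a differently tagged neighbour); the only assumption about MTE is that the check compares the pointer's tag against the tag of each 16-byte granule touched.

```python
GRANULE = 16


def write_passes_tag_check(ptr_tag: int, granule_tags: list, offset: int, length: int) -> bool:
    """True if every granule touched by the write carries the pointer's tag."""
    first = offset // GRANULE
    last = (offset + length - 1) // GRANULE
    return all(granule_tags[g] == ptr_tag for g in range(first, last + 1))


# Hypothetical heap: granules 0-1 belong to one object (tag 0x3),
# granule 2 belongs to a neighbouring allocation (tag 0x9).
# Inside the object: bytes 0-15 are a name buffer, bytes 16-23 a callback pointer.
granule_tags = [0x3, 0x3, 0x9]

# Overflowing the name buffer into the callback pointer stays inside the object,
# so the tag check has nothing to object to.
print("intra-object overflow allowed:", write_passes_tag_check(0x3, granule_tags, 0, 24))

# Running past the end of the object reaches a granule with a different tag.
print("inter-object overflow allowed:", write_passes_tag_check(0x3, granule_tags, 0, 40))
```

The first write is the case MTE cannot see, because the tag is per allocation rather than per field; the second is the case it is designed to catch.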

I think the intra-object issue might be niche enough to not matter.

And CHERI fixes it only optionally, if you accept having to change a lot more code

> This is great ...

That's Apple and here is Google (who have been at memory safety since the early Chrome/Android days):

  Google folks were responsible for pushing on Hardware MTE ... It originally came from the folks who also did work on ASAN, syzkaller, etc ... with the help and support of folks in Android ... ARM/etc as well.

  I was the director for the teams that created/pushed on it ... So I'm very familiar with the tradeoffs.
  
  ...

  Put another way - the goal was to make it possible to have the equivalent of ASAN be flipped on and off when you want it.

  Keeping it on all the time as a security mitigation was a secondary possibility, and has issues besides memory overhead.

  For example, you will suddenly cause tons of user-visible crashes. But not even consistently. You will crash on phones with MTE, but not without it (which is most of them).

  This is probably not the experience you want for a user.

  For a developer, you would now have to force everyone to test on MTE enabled phones when there are ~1mn of them. This is not likely to make developers happy.

  Are there security exploits it will mitigate? Yes, they will crash instead of be exploitable. Are there harmless bugs it will catch? Yes.

  ...

  As an aside - It's also not obvious it's the best choice for run-time mitigation.
https://news.ycombinator.com/item?id=39671337

Google Security (e.g. TAG & Project Zero) does so much to tackle CVEs, but with MTE the mothership dropped the ball so hard.

This is a Daniel Berlin post explaining why Google didn't originally enable MTE full-time on Android. It explicitly acknowledges that keeping MTE enforcement enabled for everyone would block vulnerabilities.
Unfortunately, Daniel Berlin did not push Google to invest in MTE for security specifically, like Apple has done now with EMTE (MTE v4?). I mean, AOSP is investing heavily in rewriting core components like Binder IPC in Rust for memory safety instead... They also haven't resurrected the per-app toggle to disable JIT in ART for Java/Kotlin apps (like DVM's android:vmSafeMode)... especially after having delivered on-device "Isolated compilation" but (from what I can tell) only for OS (Java/Kotlin) components.

AOSP's security posture is frustrating (as Google seemingly solely decides what's good and what's bad and imposes that decision on each of their 3bn users & ~1m developers, despite some in the security community, like Daniel Micay, urging them to reconsider). The steps Apple has been taking (in both empowering the developers and locking down its own OS) in response to the Celebgate and Pegasus hacks have been commendable.

Google did invest in MTE. In fact you linked to some of their investments that ended up trickling down to Android. The problem is that actually shipping this is hard and Google was not able to do it. No, "some in the security community" being loud does not mean it is ready to ship. Google identified several problems that they were not able to solve and thus did not ship it generally.
> Google identified several problems that they were not able to solve and thus did not ship it generally.

My lament is, Google did not push it through when it mattered as Apple here has (assuming FEAT_MTE4 is them solving similar problems to productize MTE for security).

> "some in the security community" being loud

Think the GrapheneOS authors deserve more respect. They aren't merely "loud", they shipped features that AOSP later incorporated.

No, FEAT_MTE4 is just part of it. There's a bunch of implementation work that goes on top of it to make it perform well for consumer devices.
Meanwhile Oracle has been doing it since 2015 with SPARC ADI on Solaris.

I do agree it is a pain not seeing this becoming widely adopted.

As for disabling JIT, it would have the same effect as early Androids, lagging behind Symbian devices, with applications that were wrappers around NDK code.

> As for disabling JIT, it would have the same effect as early Androids

DVM tried to mitigate the slowness with JIT+SSA, but ART mixed in JIT+SSA alongside AOT+PGO (that is, ART with JIT disabled means a fully AOT ART, unlike in DVM, where the interpreter takes over when in vmSafeMode). Even if the runtime will continue to lag in terms of power/performance efficiency wrt ObjC/Swift, Google should at least let the developers decide if they want to disallow JIT from creating executable memory regions inside their app's sandbox, like Apple does: https://developer.apple.com/documentation/security/hardened-...

RIP Vigilant Labs

Okay, a bit drastic; I don’t really know if this will affect them.

I think they're going to print money hats, but we'll see. Remember: there isn't a realistic ceiling on what NATO-friendly intelligence and law enforcement agencies will pay for this technology; it competes with human intelligence, which is nosebleed expensive.