by leoc 279 days ago
In terms of Apple Kremlinology, should this be seen as a step towards full capability-based memory safety like CHERI ( https://en.wikipedia.org/wiki/Capability_Hardware_Enhanced_R... ) or more as Apple signaling that it thinks it can get by without something like CHERI?

IMO it's the latter; CHERI requires a lot of heavy lifting at the compile-and-link layer that restricts application code behaviors, and an enormous change to the microarchitecture. On the other hand, heap-cookies / tag secrets can be delegated to the allocator at runtime in something like MIE / MTE, and existing component-level building blocks like the SPTM can provide some of the guarantees without needing a whole parallel memory architecture for capabilities like CHERI demands.
> CHERI requires a lot of heavy lifting at the compile-and-link layer that restricts application code behaviors, and an enormous change to the microarchitecture.

Well, Apple already routinely forces developers to recompile their applications, so if Apple wants to introduce something needing a compiler / toolchain update, they can do that easily. They also control the entire SoC from start to finish and, unlike pretty much everyone else, hold an ARM Architecture License, so they can change whatever they want on the hardware side as well.

To reiterate what I've said elsewhere, CHERI does not need a whole parallel memory architecture, there is just one that gets a slight extension over a non-CHERI/MTE system to include tags. But that is the same story as MTE, which also needs to propagate the tags in the memory system (and in fact, more tags, since we just need one bit per 16 bytes, whereas MTE needs 4 bits per 16 bytes in the common scheme).
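For a sense of scale, here is a rough back-of-the-envelope sketch of the tag storage the two schemes imply, using only the figures quoted above (1 bit vs. 4 bits per 16 bytes). The 8 GiB memory size is an arbitrary assumption for illustration, not something from the thread.

```python
from fractions import Fraction

GRANULE_BYTES = 16  # both schemes tag memory in 16-byte units, per the comment above


def tag_overhead(tag_bits_per_granule: int) -> Fraction:
    """Tag storage as a fraction of the data it covers."""
    return Fraction(tag_bits_per_granule, GRANULE_BYTES * 8)


def tag_bytes(memory_bytes: int, tag_bits_per_granule: int) -> int:
    """Bytes of tag storage needed to cover `memory_bytes` of data."""
    granules = memory_bytes // GRANULE_BYTES
    return granules * tag_bits_per_granule // 8


GIB = 1 << 30
MIB = 1 << 20
MEMORY = 8 * GIB  # hypothetical system size, chosen only for illustration

cheri_overhead = tag_overhead(1)  # one validity bit per 16 bytes
mte_overhead = tag_overhead(4)    # four tag bits per 16 bytes (the common scheme)

print(cheri_overhead, mte_overhead)       # 1/128 1/32
print(tag_bytes(MEMORY, 1) // MIB, "MiB")  # 64 MiB
print(tag_bytes(MEMORY, 4) // MIB, "MiB")  # 256 MiB
```

So under these assumptions the MTE tag store is four times the size of the CHERI one for the same amount of memory.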
> compile-and-link layer

Not to mention the dynamic linker.

Yeah you need a compiler, linker and OS. That's true of any security technology. CHERI may be more significant in that regard because it's a bigger rethink than just stuffing some extra metadata into the existing types, but it's not at all intractable. We, a research group, maintain CheriBSD, a "full-fat" port of FreeBSD to CHERI (Morello and CHERI-RISC-V), so to a big tech organisation it's a small investment. The cost to tech companies is not making it work, it's often much more boring business factors.
Homepage here:

  https://www.cheribsd.org/
which strangely doesn’t seem to link here:

  https://github.com/CTSRD-CHERI/cheribsd
MTE and CHERI are so different that it’s hard and maybe not even possible to do both at the same time (you might not have enough spare bits in a CHERI 128 bit ptr for the MTE tag)

They also imply a very different system architecture.

We actually have ideas for how to combine the two; see section C.5 of https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-987.pdf
Sure, I'm not suggesting that Apple might actually do both at the same time. They could however implement the less burdensome one now while intending to replace it with the all-singing-all-dancing alternative down the line.
Gotcha. My point about different systems architectures makes me think it’s unlikely that you’d want to do that
> MTE and CHERI are so different that it’s hard and maybe not even possible to do both at the same time (you might not have enough spare bits in a CHERI 128 bit ptr for the MTE tag)

Why would you need MTE if you have CHERI?

Why would you need CHERI if you have working mitigations that don't demand a second bus?

I think it's two sides of the same coin, and Apple chose the second side.

The two systems are largely orthogonal; I think if Apple chooses to go from one to the other it will be a generational change rather than an incremental one. The advantage of MTE/MIE is you can do it incrementally by just changing the high bits the allocator supplies; CHERI requires a fundamental paradigm shift. Apple love paradigm shifts but there's no indication they're going to do one here; if they do, it will be a separate effort.
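To make "changing the high bits the allocator supplies" concrete, here is a toy model of MTE-style tagging. It assumes the Arm MTE layout (a 4-bit logical tag in pointer bits 59:56, one allocation tag per 16-byte granule). `ToyTaggedHeap` and its methods are hypothetical names, and the round-robin tag choice is a simplification for reproducibility; real allocators typically pick tags randomly.

```python
TAG_SHIFT = 56               # MTE keeps the 4-bit logical tag in pointer bits 59:56
TAG_MASK = 0xF << TAG_SHIFT
GRANULE = 16                 # one allocation tag per 16-byte granule


def set_tag(ptr: int, tag: int) -> int:
    return (ptr & ~TAG_MASK) | ((tag & 0xF) << TAG_SHIFT)


def get_tag(ptr: int) -> int:
    return (ptr >> TAG_SHIFT) & 0xF


def strip_tag(ptr: int) -> int:
    return ptr & ~TAG_MASK


class ToyTaggedHeap:
    """Hypothetical bump allocator that tags each allocation's granules."""

    def __init__(self) -> None:
        self.granule_tags = {}   # granule index -> allocation tag
        self.next_addr = 0x1000
        self.next_tag = 1

    def malloc(self, size: int) -> int:
        size = -(-size // GRANULE) * GRANULE      # round up to whole granules
        addr, tag = self.next_addr, self.next_tag
        self.next_addr += size
        self.next_tag = self.next_tag % 15 + 1    # cycle 1..15; 0 means untagged here
        for g in range(addr // GRANULE, (addr + size) // GRANULE):
            self.granule_tags[g] = tag
        return set_tag(addr, tag)                 # the caller only ever sees a tagged pointer

    def check(self, ptr: int) -> bool:
        """Stand-in for the hardware check: pointer tag must equal the granule's tag."""
        return self.granule_tags.get(strip_tag(ptr) // GRANULE, 0) == get_tag(ptr)


heap = ToyTaggedHeap()
a = heap.malloc(24)
b = heap.malloc(16)
print(heap.check(a), heap.check(a + 16))  # True True: inside a's two granules
print(heap.check(a + 32))                 # False: a's tag used on b's granule
```

Application code is unchanged in this model; only the allocator and the check know about tags, which is the incremental-adoption point being made above.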

CHERI is deterministic.

That’s strictly better, in theory.

(Not sure it’s practically better. You could make an argument that it’s not.)

FWIW (I am a nobody compared to you; I didn't make FIL-C :) ) - I think that MIE/MTE are practically superior to CHERI.

I also think this argument is compelling because one exists in millions of consumer devices, with more to come (MTE -> MIE), and one does not.

This is on the verge of pedantry, but CHERI determinism isn't strictly true: garbage collecting abandoned descriptors is currently done asynchronously. Malicious code could attempt to reuse an abandoned descriptor before it is "disappeared". I think it might be possible to construct a synthetic situation where two threads with an IPC channel between them, operating with perhaps different privilege in the same address space (something CHERI can support!), might be affected by the timing.

There is a section in the technical reports that talks about garbage collection.

I don't think CHERI is currently being used with different privileged threads in the same address space.

I suspect that the parent poster was referring to MTE's memory protection being probabilistic. There are only 16 tag values for an attacker to guess. You can combine MTE and PAC, but PAC is also only probabilistic.

With CHERI, there is nothing to guess. You either have a capability or you don't.
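As a rough illustration of what "probabilistic" means with 16 tag values, here is a small calculation. It assumes the attacker guesses blindly, that tags are uniformly distributed, and that attempts are independent; real deployments may exclude some tag values or kill the process on the first mismatch, which changes the numbers.

```python
from fractions import Fraction

TAG_BITS = 4
TAG_VALUES = 2 ** TAG_BITS  # the 16 values mentioned above


def p_guess_matches(tag_values: int = TAG_VALUES) -> Fraction:
    """Chance that one blind guess hits the right tag."""
    return Fraction(1, tag_values)


def p_all_guesses_match(attempts: int, tag_values: int = TAG_VALUES) -> Fraction:
    """Chance that every one of `attempts` independent blind guesses matches."""
    return p_guess_matches(tag_values) ** attempts


def p_detected(attempts: int, tag_values: int = TAG_VALUES) -> Fraction:
    """Chance that at least one of the attempts mismatches and faults."""
    return 1 - p_all_guesses_match(attempts, tag_values)


print(p_guess_matches())       # 1/16
print(p_all_guesses_match(3))  # 1/4096
print(float(p_detected(1)))    # 0.9375
```

An exploit that needs several tagged accesses in a row gets caught with high probability, but never with certainty, which is the contrast with capabilities drawn above.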

Second bus?
CHERI fundamentally relies on capabilities living in memory that is architecturally separate from program memory. You could do so using a bus firewall, but then you're at the same place as MIE with the SPTM.
That's not true. Capabilities are in main memory as much as any other data. The tags are in separate memory (whether a wider SRAM, DRAM ECC bits, or a separate table off on the side in a fraction of memory that's managed by the memory controller; all three schemes have been implemented and have trade-offs). But this is also true of MTE; you do not want those tags in normal software-visible main memory either, they need to be protected.
A CHERI capability is stored in main memory but with the tag bit for that location set. The tags are stored in separate memory pages, also in main memory in current designs.
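A toy sketch of that arrangement, in case it helps: capabilities sit in the same memory as everything else, and a separate tag table records which slots currently hold a valid one. `ToyTaggedMemory`, `Capability` and `dereference` are hypothetical names, and the model is heavily simplified (no bounds or permission checks, and it only models the tag check at the point of use).

```python
from dataclasses import dataclass

SLOT_BYTES = 16  # one tag bit covers one aligned, capability-sized slot


@dataclass(frozen=True)
class Capability:
    base: int
    length: int


class ToyTaggedMemory:
    def __init__(self) -> None:
        self.slots = {}  # ordinary memory: holds capabilities and plain data alike
        self.tags = {}   # separate tag table, not addressable by normal stores

    def store_capability(self, addr: int, cap: Capability) -> None:
        slot = addr // SLOT_BYTES
        self.slots[slot] = cap
        self.tags[slot] = True

    def store_data(self, addr: int, value) -> None:
        slot = addr // SLOT_BYTES
        self.slots[slot] = value
        self.tags[slot] = False  # an ordinary data write always clears the tag

    def load(self, addr: int):
        slot = addr // SLOT_BYTES
        return self.slots.get(slot), self.tags.get(slot, False)


def dereference(mem: ToyTaggedMemory, addr: int):
    """Use the value at `addr` as a capability; refuse if its tag is clear."""
    value, tag = mem.load(addr)
    if not tag:
        raise PermissionError("tag is clear: not a valid capability")
    return value


mem = ToyTaggedMemory()
mem.store_capability(0x100, Capability(base=0x4000, length=64))
genuine = dereference(mem, 0x100)

# Overwrite the slot with a forged bit pattern through a normal data store.
mem.store_data(0x100, Capability(base=0, length=2**64))
try:
    dereference(mem, 0x100)
    forged_accepted = True
except PermissionError:
    forged_accepted = False

print(genuine, forged_accepted)
```

The forged value has the right shape but no tag, so it is useless as a capability; that is why only the tags, not the capabilities themselves, need protected storage.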

Maybe you've been confused by a description of how it works inside a processor. In early CHERI designs, capabilities were in different architectural processor registers from integers.

In recent CHERI designs, the same register numbers are used for capabilities and other registers. A micro-architecture could be designed to have either all registers be capability registers with the tag bit, or use register renaming to separate integer and capability registers.

I suppose a CHERI MCU for embedded systems with small memory could theoretically have tag pages in separate SRAM instead of caching main memory, but I have not seen that.

So something like having built in RAM for the pagetables that aren’t part of the normal pool? That way no matter what kind of attack you come up with user space cannot pass a pointer to it?
Not saying you’d want both. Just answering why MTE isn’t a path to CHERI

But here’s a reason to do both: CHERI’s UAF story isn’t great. Adding MTE means you get a probabilistic story at least

True! On the flip side, MTE sucks at intra-object corruption: if I get access to a heap object with pointers, MTE doesn't affect me, I can go ahead and write to that object because I own the tag.
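A minimal sketch of the intra-object case, assuming 16-byte granules and a made-up 32-byte object with a buffer followed by a callback pointer. The addresses, tags and field layout are all hypothetical.

```python
GRANULE = 16
granule_tag = {}  # granule index -> allocation tag


def allocate(addr: int, size: int, tag: int) -> None:
    """Tag every granule covered by the allocation with the same tag."""
    for g in range(addr // GRANULE, (addr + size + GRANULE - 1) // GRANULE):
        granule_tag[g] = tag


def access_ok(addr: int, ptr_tag: int) -> bool:
    """Stand-in for the hardware check on a tagged access."""
    return granule_tag.get(addr // GRANULE) == ptr_tag


# Hypothetical layout: struct { char buf[16]; void (*callback)(void); }
OBJ, OBJ_TAG = 0x1000, 0x3
BUF_OFFSET, CALLBACK_OFFSET = 0, 16
NEIGHBOUR, NEIGHBOUR_TAG = 0x1020, 0x7

allocate(OBJ, 32, OBJ_TAG)
allocate(NEIGHBOUR, 16, NEIGHBOUR_TAG)

# Overflowing buf into callback stays inside the object, so the tags match.
intra_object_write_allowed = access_ok(OBJ + CALLBACK_OFFSET, OBJ_TAG)
# Running off the end into the neighbouring allocation hits a different tag.
inter_object_write_allowed = access_ok(NEIGHBOUR, OBJ_TAG)

print(intra_object_write_allowed, inter_object_write_allowed)  # True False
```

The whole object carries one tag, so the check cannot tell a write to `buf` from a write to `callback`.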

Overall my _personal_ opinion is that CHERI is a huge win at a huge cost, while MTE is a huge win at a low cost. But, there are definitely vulnerability classes that each system excels at.

I think the intra object issue might be niche enough to not matter.

And CHERI fixes it only optionally, if you accept having to change a lot more code

Where studies suggest "a lot" is sub-0.1%. For example, https://www.capabilitieslimited.co.uk/_files/ugd/f4d681_e0f2... was a study into porting 6 million lines of C and C++ to run a KDE+X11 desktop stack on CHERI, and saw 0.026% LoC change, or ~1.5k LoC out of ~6 million LoC, all done in just 3 months by one person. That's even an overestimate, because it includes many changes to build systems just to be able to cross-compile the projects. It's not nothing, but it's the kind of thing where a single engineer can feasibly port large bodies of code. Yes, certain systems code will be worse (like JITs), but the vast majority of cases are not that, and even those are still feasible (e.g. we have people working with Chromium and V8).
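A quick sanity check of those figures, using only the numbers quoted above (roughly 6 million lines, 0.026% changed):

```python
TOTAL_LOC = 6_000_000          # approximate size of the ported code base
CHANGED_PER_100K = 26          # 0.026% expressed as lines per 100,000

changed_loc = TOTAL_LOC * CHANGED_PER_100K // 100_000
print(changed_loc)             # 1560, i.e. the "~1.5k LoC" quoted above
```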
I think I broadly agree with you. IMO tagging is practically much, much more valuable than capabilities systems modeled like CHERI.