by ENOTTY 1472 days ago
Having grokked the abstract, I feel like I can speculate a bit as to what is going on. Take this with a grain of salt; I have no clue what has actually been discovered.

I believe that the researchers have found a way to remove PAC as a barrier to exploitation by disclosing PAC verification results via speculative execution. This is only useful to attackers going after a target that uses PAC, and those attackers will need to have another vulnerability that enables them to hijack control-flow through modifying pointers to code that are located in memory.

The attackers can use this new Pacman vulnerability as a crash-free oracle that says whether their forged pointer worked, and once they find a working one, they can use that to hijack control flow.

PAC (or Pointer Authentication) is a security feature found in recent iPhones, the Apple Silicon Macs, and the Graviton3. It is intended as a defense against control-flow hijacks. It works by signing pointers found in memory with one of five keys that are known only to the processor. Before the pointer is used, the processor should be instructed to "authenticate" the pointer by checking the pointer's signature using its private keys. To prevent a pointer that was signed for one place in the program from simply being reused in another place, code can provide a "context" value to be used during signing and authentication.
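To make the sign/authenticate/context idea concrete, here is a toy model in Python. It is only a sketch: the 16-bit signature width, the 48-bit address layout, the use of HMAC-SHA256 as the signing function, and the names `compute_pac`, `sign`, and `authenticate` are all my assumptions for illustration. Real hardware uses its own keyed primitive and its own pointer layout.

```python
import hashlib
import hmac

PAC_BITS = 16                     # assumption: 16-bit signature (matches the 2^16 figure in this thread)
ADDR_BITS = 48                    # assumption: the low 48 bits carry the address in this toy layout
ADDR_MASK = (1 << ADDR_BITS) - 1


def compute_pac(key: bytes, address: int, context: int) -> int:
    """Toy signature: a truncated HMAC over (address, context). Not the real algorithm."""
    msg = address.to_bytes(8, "little") + context.to_bytes(8, "little")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:PAC_BITS // 8], "little")


def sign(key: bytes, address: int, context: int) -> int:
    """Store the signature in the otherwise unused upper bits of the pointer."""
    if address != address & ADDR_MASK:
        raise ValueError("address does not fit in the toy address space")
    return (compute_pac(key, address, context) << ADDR_BITS) | address


def authenticate(key: bytes, signed_pointer: int, context: int):
    """Return the plain address if the signature checks out, else None.

    Real hardware does not return None: it hands back a corrupted pointer
    that faults when it is used.
    """
    address = signed_pointer & ADDR_MASK
    pac = signed_pointer >> ADDR_BITS
    if pac == compute_pac(key, address, context):
        return address
    return None


if __name__ == "__main__":
    key = b"\x01" * 16
    signed = sign(key, 0x7FFF12345678, context=42)
    print(hex(authenticate(key, signed, context=42)))  # same context: the address comes back
    # A different context fails unless the two 16-bit signatures happen to collide.
    print(authenticate(key, signed, context=43))
```

The point of the context argument is visible in the last two calls: the same signed pointer only authenticates where the expected context matches the one used at signing time.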

A great resource for learning about PAC and its usage in the Apple platforms is at [1] (it links to other resources), and if you want to play with a PAC-enabled binary, check out [2].

[1]: https://googleprojectzero.blogspot.com/2019/02/examining-poi...

[2]: https://blog.ret2.io/2021/06/16/intro-to-pac-arm64/

EDIT: The attack works by:

1) Placing your guess such that it is used as the pointer input to an authentication instruction.

2) Causing a branch misprediction. On the not-taken side of the branch, the code needs to authenticate the pointer and then use it. On the taken side of the branch, the code should not crash.

3) The CPU speculatively executes down the not-taken side of the branch (misprediction) and speculatively executes the authentication instruction.

4) If your guess is correct, the authentication instruction will return a valid pointer. If your guess is incorrect, the authentication instruction returns a pointer that, if dereferenced, will cause an exception.

5) The CPU speculatively executes a load (in the case of a data pointer) or an instruction fetch (in the case of a code pointer) on the pointer value.

6) If the pointer is valid, the address translation for that pointer will appear in the TLB. If the pointer is not valid, it will not (because of the exception).

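7) All of the architectural effects from this mispredicted branch get squashed when the CPU realizes that the branch was mispredicted (it is actually taken). No exception is actually thrown! The TLB entry is microarchitectural state, so it survives the squash.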

8) Measure the TLB entries to determine whether the speculative address translation made it in. If it is present, you know that the guess is correct.

9) Repeat, up to 2^16 times.
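The loop in steps 1 through 9 can be sketched as a pure simulation. To be clear about what is assumed: `ToyVictim`, `speculative_probe`, `architectural_use`, and `brute_force_pac` are hypothetical names, and the probe simply returns the answer that the real attack would have to infer from noisy TLB timing. Nothing here touches real hardware; it only shows why a crash-free oracle turns a 16-bit signature into at most 2^16 harmless guesses.

```python
import secrets

PAC_BITS = 16  # assumption: 16-bit signature, hence the 2^16 bound in step 9


class ToyVictim:
    """Stand-in for the victim. All names and behavior here are hypothetical."""

    def __init__(self) -> None:
        self._correct_pac = secrets.randbelow(1 << PAC_BITS)
        self.crashes = 0

    def speculative_probe(self, guess: int) -> bool:
        """Models steps 1-8: True means the translation would be seen in the TLB.

        The authentication only happens on the squashed path, so a wrong
        guess never raises an exception and never counts as a crash.
        """
        return guess == self._correct_pac

    def architectural_use(self, guess: int) -> str:
        """Models using the forged pointer for real: a wrong guess crashes."""
        if guess != self._correct_pac:
            self.crashes += 1
            raise RuntimeError("authentication failure: victim crashes")
        return "control flow hijacked"


def brute_force_pac(victim: ToyVictim):
    """Step 9: try every possible signature until the oracle says yes."""
    for attempts, guess in enumerate(range(1 << PAC_BITS), start=1):
        if victim.speculative_probe(guess):
            return guess, attempts
    raise AssertionError("unreachable: one of the 2^16 guesses must match")


if __name__ == "__main__":
    victim = ToyVictim()
    pac, attempts = brute_force_pac(victim)
    print(f"found signature after {attempts} probes, crashes so far: {victim.crashes}")
    print(victim.architectural_use(pac))
```

Without the oracle, the attacker would have to call something like `architectural_use` directly, and the first wrong guess would crash the target.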

3 comments

You hit the nail right on the head! That's exactly what we did :)
Apparently they haven’t fixed it yet, so a hardware solution may in fact not be possible, but is there any reason to believe it couldn’t be patched in “microcode”?

Who can guess at the performance impact, but one could imagine a configurable mechanism capable of disabling speculation past a PAC authentication.

Does the M1 even have microcode?
I’d be shocked if a modern CPU didn’t have some kind of “firmware” to respond to errata.
Amazingly articulate writing. I think you have a second career as a tech writer if you ever wanted one.
Hahaha thank you. I am humbled by your praise
And this can really not be fixed in any way? Not trolling, happy to barely understand this in the first place
Until it has been fixed in hardware, I think it could be mitigated in software a bit, but at a cost. A PAC signature can also include a 64-bit "context" value, which you could make unique per pointer (like a nonce).

However, context values are not something that is supported by any C ABI: the PAC extension also contains instructions that hardcode the context value to zero, and I would guess that those are what the kernel is using currently. To make use of context values, I suppose you would have to use a new ABI that effectively stores 128-bit pointers, and which also creates the random nonces/keys/whatyoucallthem to store in them.

I do not understand how you want to mitigate this issue by using the "context", given that the attack demonstration is done with source code that makes use of the "context". The attack is fully context-agnostic, since the "PACMAN gadget" in the victim's code supplies the "context" by itself.

The root of the problem is the small hash size and the fact that you can "suppress" the effects of a failed hash check in order to bruteforce the hash. (A failed hash check is expected to cause a crash, which was intended to prevent bruteforcing.)

Speculative exec ruins everything again.

I get the performance gainz, but when are we going to get past the formal fallacy that executing any instruction we don't need to based on actual flow is de facto a complete violation of user expectations and therefore completely unsafe to do?

Every layperson I explain speculative execution to seems able to recognize that a pipeline stall to figure out what a value actually is, is just the way to go.

Hell, my personal sanity check with computing is that there must exist a humans-only implementation that correlates to a good computing primitive.

Nowhere on Earth will you find an organization that will execute both sides of a conditional process requiring humans to do the work just to throw away the result. Not taken.

Oh wait... Finance does it with Hedges...

Frigging finance. Ruins everything for everyone.

> Nowhere on Earth will you find an organization that will execute both sides of a conditional process requiring humans to do the work just to throw away the result. Not taken.

Speculative execution today (within modern high-end processors) does not execute both sides of a conditional branch.

It would indeed be a waste of power and it would be a much more complex micro-architecture.

Modern speculative OoO processors execute a single path and simply rely on the accuracy of the branch predictor. And predictors are pretty accurate, on the order of 3 mispredictions every 1000 instructions. The power spent on unnecessary work due to a misprediction is quite low.

Modern processors consume much of their power in Out-of-Order instruction scheduling.

Looks like I need to go through Patterson and Hennessy again.

I was fairly sure that as many computations as possible were done in the same cycle, with the "not the case" results ditched on subsequent cycles, but I'll be the first to admit I haven't synced on the bleeding-edge lit recently. And sipping power has become much more of a concern in recent years, so I may be due for a refresh anyway.

My favorite pastime, if the statistics are to be believed.

C pointers don’t typically use the context very much, but C++ uses it heavily.
Personally, I believe it can be fixed via key rotation. There are 3 inputs to the PAC algorithm: the pointer, a "modifier", and a "key" (e.g. APIAKey_EL1).

I would have added that as a potential mitigation in the mitigations section. I think, say, changing the key every so often would be a reasonable task for a kernel to do, especially given how long this took to exploit (about 3 minutes).

Hi! This is an interesting idea. However, a problem arises: if you rotate the key, then old pointers become invalid. And since the kernel is always alive and servicing requests (and contains structures with very long lifetimes), we don't believe this to be a practical solution.
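The objection to key rotation can be illustrated with a toy model. This is a sketch under loud assumptions: the signature is a truncated HMAC-SHA256 rather than the real hardware primitive, the context is fixed at zero, and `toy_pac`, `sign_table`, and `count_valid` are hypothetical names. It only shows that pointers signed under the old key almost never verify under a new one, so every long-lived signed pointer would need to be found and re-signed.

```python
import hashlib
import hmac
import secrets


def toy_pac(key: bytes, address: int, context: int = 0) -> int:
    """Toy 16-bit signature over (address, context). Not the real algorithm."""
    msg = address.to_bytes(8, "little") + context.to_bytes(8, "little")
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:2], "little")


def sign_table(key: bytes, addresses) -> dict:
    """Pretend these are long-lived kernel structures holding signed pointers."""
    return {addr: toy_pac(key, addr) for addr in addresses}


def count_valid(table: dict, key: bytes) -> int:
    """How many stored signatures still authenticate under the given key."""
    return sum(1 for addr, pac in table.items() if toy_pac(key, addr) == pac)


if __name__ == "__main__":
    old_key = secrets.token_bytes(16)
    new_key = secrets.token_bytes(16)  # the "rotated" key
    table = sign_table(old_key, range(0x1000, 0x1000 + 1000 * 8, 8))

    print("valid under old key:", count_valid(table, old_key))
    # Under the new key a pointer only survives by a chance 16-bit collision.
    print("valid under new key:", count_valid(table, new_key))
```

In this model, rotation is only safe if the kernel can re-sign every entry in `table` at the moment the key changes, which is the practicality concern raised above.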