Hacker News new | ask | show | jobs
by londons_explore 3092 days ago
I can't really see how it would be fixable even with new hardware.

Speculative execution is fundamental to getting decent performance out of a CPU. Without it you should probably divide your performance expectations by 5 at least.

Rolling back all state rather than just user visible state in the CPU is neigh on impossible. When you evict something from the cache, you delete it. Undeleting is hard. There are also a lot of other non-user-visible bits of state in a CPU.

3 comments

I agree that we'll probably see new attacks in this area for a long time.

That said, the main new ingredient of Spectre seems to be the idea that userspace can poison the branch target buffer to cause speculative execution of arbitrary code in kernel space. That part of the attack should be fairly easy to mitigate with new hardware, by XORing (or hashing) the index into the BTB with a configurable value that depends on the privilege level. So each process has its own "nonce", and they're all different from the kernel's.

Then BTB poisoning won't work unless the attacker knows its own and the other context's nonce. Even if further attacks are found that leak this nonce, they could be mitigated by changing the nonce at regular intervals.

Couldn't you do something like have a separate chunk of "speculative cache" which you only commit to the main cache once the speculatively-executed instructions are retired? Sounds complex, sure - but it seems like that would give you the performance benefits of speculative execution while still being able to roll back (or prevent in the first place) any cache-state side effects when branches were mispredicted. Could also imagine processors start segregating cache by privilege level.

I guess part of the question you're raising is: are there so many different caches, translation buffers, etc. in a modern CPU that keeping 'uncommitted buffers' for the state of all of them would be just as complex as throwing a whole other core in there?

No, that would not be enough. CPUs speculatively execute across multiple branches. Even if you had a separate speculative cache for every code path, you could still build a side-channel from the amount of contention. [1]

[1] https://eprint.iacr.org/2016/613.pdf

> Both hardware thread systems (SMT and TMT) expose contention within the execution core. In SMT, the threads effectively compete in real time for access to functional units, the L1 cache, and speculation resources (such as the BTB). This is similar to the real-time sharing that occurs between separate cores, but includes all levels of the architecture. [...] SMT has been exploited in known attacks (Sections 4.2.1 and 4.3.1)

Possibly - but that's an entirely new processor design. That would take years to get released an adopted.

The scary thing is that you can't fix this in software.

It effectively means wiping the caches, TLBs, BTBs and any other caches and optimisations on any form of context switch, as far as I can see? Which yes will likely require new silicon.