After reading that thread, I sort of wonder if this is the catalyst for the next tech bust. Prices on the basic building block of the modern tech industry (a server shard) going up 30%, or even more as shared/virtual services must be decommissioned for isolation? Surely it’s an alarmist thing to think and I don’t think it’s likely, but if you had asked me yesterday about the likelihood of an underlying security vulnerability affecting every processor since 1995, I’d have said probably not.
Major props to the teams working on this... now time for us all to hold onto our pants as we ask for budget increases that will make shareholders demand blood.
The only sorts of companies where server costs could increase hugely due to a sudden need for hardware isolation are those where they're running tiny or incredibly bursty workloads. Big companies like Netflix that use tons of cores can just binpack their work all together on the same hardware so their jobs only share hardware with other jobs controlled by the same company. Effectively, cloud providers will start offering sub-clouds into which only your own jobs will be scheduled.
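To make the binpacking point concrete, here is a rough sketch, not how any real cloud scheduler works. It uses first-fit-decreasing packing with the extra rule that a host only ever holds one tenant's jobs. The function names, the 32-core host size and the job mix are all made up for illustration.

```python
from collections import defaultdict


def pack_by_tenant(jobs, cores_per_host):
    """First-fit-decreasing packing where a host never mixes tenants.

    jobs: iterable of (tenant, cores) pairs.
    Returns {tenant: [host, ...]} where each host is a list of job sizes.
    """
    by_tenant = defaultdict(list)
    for tenant, cores in jobs:
        if cores > cores_per_host:
            raise ValueError(f"a {cores}-core job does not fit on one host")
        by_tenant[tenant].append(cores)

    placement = {}
    for tenant, sizes in by_tenant.items():
        hosts = []
        for size in sorted(sizes, reverse=True):
            for host in hosts:
                if sum(host) + size <= cores_per_host:
                    host.append(size)  # fits on a host this tenant already owns
                    break
            else:
                hosts.append([size])  # open a new host for this tenant only
        placement[tenant] = hosts
    return placement


def hosts_needed(placement):
    """Total number of hosts across all tenants."""
    return sum(len(hosts) for hosts in placement.values())


# Hypothetical mix: one big tenant with 64 four-core jobs,
# plus eight tiny tenants with a single two-core job each.
jobs = [("big", 4)] * 64 + [(f"tiny{i}", 2) for i in range(8)]
placement = pack_by_tenant(jobs, cores_per_host=32)
print(len(placement["big"]), hosts_needed(placement))
```

In this made-up mix the big tenant fills its 8 hosts completely, so isolation costs it nothing, while each tiny tenant ends up holding a whole 32-core host for a 2-core job. That is the asymmetry described above.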
This is actually how cloud tech has worked for many years internally. I worked at Google for a long time and their cluster control system (Borg) had a concept called "allocs" which were basically scheduling sub-domains. You could schedule an alloc to reserve some resources, and then schedule jobs into the alloc which would share those resources. Allocs were often used to avoid performance-related interference from shared jobs, e.g. when a batch job kept hogging the CPU caches and slowing down latency sensitive servers. I suppose these days VMs and containers do a similar job, though I think the Borg approach was nicer and more efficient.
I guess this sort of per-firm isolation will become common and most companies' costs won't change a huge amount. The people it'll hit will be small mom-and-pop personal servers, but they're unlikely to care about side channel attacks anyway. So I wouldn't sell stock in cloud providers just yet.
Yes, from my understanding, Spectre is an architectural-level flaw in the so-called speculative execution unit. In other words, Spectre will only be fixed once Intel, AMD, and ARM redesign the unit and release new processors. Given the timelines of CPU design, this will take 5-10 years at least.
On the positive side, the flaw is very difficult to exploit in a practical setting.
> On the positive side, the flaw is very difficult to exploit in a practical setting.
Is it?
"As a proof-of-concept, JavaScript code was written that, when run in the Google Chrome browser, allows JavaScript to read private memory from the process in which it runs"
There are possible mitigations for cloud providers:
1) pay $x / hour and run on shared machine with possibility of an attack;
2) pay $y / hour (where x < y) and run all your processes on dedicated machines without anybody else.
Moreover, option 2) already exists for large customers and security-sensitive applications (e.g. CIA dedicated cloud built by Amazon).
Amazon instances can be created with the dedicated flag. The host hardware will be dedicated to you, not shared with any other users. It should mitigate the attack.
The flag has a fixed fee in the thousands of dollars and each instance is 10% more expensive.
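A back-of-the-envelope sketch of what that pricing shape means for small versus large users. Every number here is an assumption for illustration only: the $2000 fixed fee, treating it as a monthly charge, the $0.10 hourly rate and the 730 hours per month are not taken from any price list. Only the 10% surcharge comes from the comment above.

```python
def monthly_costs(instances, hourly_rate, surcharge=0.10,
                  fixed_fee=2000.0, hours=730):
    """Return (shared_cost, dedicated_cost) for one month.

    surcharge: per-instance markup for dedicated hardware (10% per the thread).
    fixed_fee: flat dedicated-tenancy fee; amount and billing period are
               placeholders, not real pricing.
    """
    shared = instances * hourly_rate * hours
    dedicated = shared * (1 + surcharge) + fixed_fee
    return shared, dedicated


for n in (1, 1000):
    shared, dedicated = monthly_costs(n, hourly_rate=0.10)
    print(n, round(shared, 2), round(dedicated, 2), round(dedicated / shared, 2))
```

Under these assumed numbers the fixed fee dominates for a single instance and almost vanishes for a thousand, where the overhead approaches the 10% surcharge.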
I can't really see how it would be fixable even with new hardware.
Speculative execution is fundamental to getting decent performance out of a CPU. Without it you should probably divide your performance expectations by 5 at least.
Rolling back all state rather than just user visible state in the CPU is nigh on impossible. When you evict something from the cache, you delete it. Undeleting is hard. There are also a lot of other non-user-visible bits of state in a CPU.
I agree that we'll probably see new attacks in this area for a long time.
That said, the main new ingredient of Spectre seems to be the idea that userspace can poison the branch target buffer to cause speculative execution of arbitrary code in kernel space. That part of the attack should be fairly easy to mitigate with new hardware, by XORing (or hashing) the index into the BTB with a configurable value that depends on the privilege level. So each process has its own "nonce", and they're all different from the kernel's.
Then BTB poisoning won't work unless the attacker knows its own and the other context's nonce. Even if further attacks are found that leak this nonce, they could be mitigated by changing the nonce at regular intervals.
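A toy model of the nonce idea, to show why the poisoning misses without both nonces. It is only a sketch. Real BTBs have tags, associativity and partial addressing that this ignores, and the class name, the 8-bit table size and the context labels are invented for the example.

```python
import secrets

BTB_BITS = 8                 # toy table size, not a real CPU's
BTB_SIZE = 1 << BTB_BITS


class ToyBTB:
    """Direct-mapped branch target buffer indexed by (address XOR nonce)."""

    def __init__(self):
        self.entries = [None] * BTB_SIZE
        self.nonces = {}

    def set_nonce(self, context, nonce):
        self.nonces[context] = nonce & (BTB_SIZE - 1)

    def rekey(self, context):
        # Rotating the nonce makes any previously leaked value stale.
        self.nonces[context] = secrets.randbits(BTB_BITS)

    def _index(self, context, branch_addr):
        return (branch_addr ^ self.nonces[context]) & (BTB_SIZE - 1)

    def train(self, context, branch_addr, target):
        self.entries[self._index(context, branch_addr)] = target

    def predict(self, context, branch_addr):
        return self.entries[self._index(context, branch_addr)]


btb = ToyBTB()
btb.set_nonce("user", 0x00)
btb.set_nonce("kernel", 0x5A)

# Attacker trains its own branch at the address the kernel branch lives at.
btb.train("user", 0x42, 0xBAD)
print(btb.predict("kernel", 0x42))   # the poisoned entry is not where the kernel looks

# With both nonces known, the attacker can aim at the kernel's slot.
btb.train("user", 0x42 ^ 0x00 ^ 0x5A, 0xBAD)
print(hex(btb.predict("kernel", 0x42)))
```

The second half shows the caveat from the comment: once both nonces leak, the attacker just XORs them into the training address, which is why changing the nonce periodically (`rekey` here) matters.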
Couldn't you do something like have a separate chunk of "speculative cache" which you only commit to the main cache once the speculatively-executed instructions are retired? Sounds complex, sure - but it seems like that would give you the performance benefits of speculative execution while still being able to roll back (or prevent in the first place) any cache-state side effects when branches were mispredicted. Could also imagine processors start segregating cache by privilege level.
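Roughly what that would look like as a toy model, purely to pin down the idea. The class and method names are made up, cache lines are just integers, and it deliberately ignores eviction and any contention for the speculative buffer itself.

```python
class ToyCache:
    """Cache with a side buffer for lines fetched by speculative loads."""

    def __init__(self):
        self.committed = set()     # lines visible to later timing probes
        self.speculative = set()   # lines fetched by not-yet-retired loads

    def speculative_load(self, line):
        # A speculative miss fills the side buffer, not the main cache.
        if line not in self.committed:
            self.speculative.add(line)

    def retire(self):
        # Instructions retired: their fills become architecturally visible.
        self.committed |= self.speculative
        self.speculative.clear()

    def squash(self):
        # Misprediction: drop the fills so no cache footprint remains.
        self.speculative.clear()

    def is_fast(self, line):
        # What a later timing probe would observe.
        return line in self.committed


cache = ToyCache()
cache.speculative_load(7)
cache.squash()
print(cache.is_fast(7))   # mispredicted path left no trace

cache.speculative_load(9)
cache.retire()
print(cache.is_fast(9))   # correctly predicted path is cached as usual
```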
I guess part of the question you're raising is: are there so many different caches, translation buffers, etc. in a modern CPU that keeping 'uncommitted buffers' for the state of all of them would be just as complex as throwing a whole other core in there?
No, that would not be enough. CPUs speculatively execute across multiple branches. Even if you had a separate speculative cache for every code path, you could still build a side-channel from the amount of contention. [1]
> Both hardware thread systems (SMT and TMT) expose contention within the execution core. In SMT, the threads effectively compete in real time for access to functional units, the L1 cache, and speculation resources (such as the BTB). This is similar to the real-time sharing that occurs between separate cores, but includes all levels of the architecture. [...] SMT has been exploited in known attacks (Sections 4.2.1 and 4.3.1)
It effectively means wiping the caches, TLBs, BTBs and any other caches and optimisations on any form of context switch, as far as I can see? Which yes will likely require new silicon.
Then front-runs the negotiated timeline anyway, catching projects like Xen off guard (it seems like)[0]. Will be interested to read the postmortem of the entire process from start to finish, and Xen is promising one from their perspective. I'd be especially interested to understand whether public intel was concrete enough to rush this out the door, because it didn't seem like it was, but I probably missed something.
I reimplemented variant 3 based solely on clues from twitter posts yesterday.
I am by no means a computer security guru - I just did a CPU architecture course at uni and figured I'd cowboy up an implementation. It worked nearly first time, and can read both kernel and userspace pages from userspace by fooling the branch predictor into going down the wrong path, and relying on the permission checks to be slower than the data reads from a virtually addressed cache. It can only access stuff already cached though, so you can't do a full memory dump with it.
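For anyone wondering how a cache footprint turns into data, here is a pure simulation of the timing step. It is not the poster's code and not a working exploit. The cycle counts are invented, the "cache" is a Python set, and the function names are made up. It only shows the logic: the transient access touches one of 256 probe lines chosen by the secret byte, and timing all 256 lines reveals which one.

```python
CACHE_HIT_CYCLES = 40     # made-up latency for a cached line
CACHE_MISS_CYCLES = 300   # made-up latency for an uncached line


def transient_access(secret_byte, cached_lines):
    """Stand-in for the speculative load: touches probe line [secret_byte]."""
    cached_lines.add(secret_byte)


def probe_time(line, cached_lines):
    """Simulated time to reload one probe line."""
    return CACHE_HIT_CYCLES if line in cached_lines else CACHE_MISS_CYCLES


def recover_byte(secret_byte):
    cached_lines = set()                          # "flush" every probe line
    transient_access(secret_byte, cached_lines)   # mis-speculated access
    timings = [probe_time(line, cached_lines) for line in range(256)]
    # The single fast line is the one indexed by the secret byte.
    return min(range(256), key=timings.__getitem__)


def recover_bytes(secret):
    return bytes(recover_byte(b) for b in secret)


print(recover_bytes(b"hunter2"))
```

On real hardware the hard parts are everything this skips: getting the transient access to happen at all, and measuring noisy timings reliably.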
Speculation was apparently hitting very close to home, allowing attackers with resources (think nation states) to start developing their own tooling. At least this early announcement allows people with sensitive data to quickly move to dedicated instances.
edit: well it didn't take a nation state after all: https://twitter.com/brainsmoke/status/948561799875502080 - given that, you can be sure that everybody who counts is frantically launching these on your clouds gathering whatever they can.
As far as I know they HAVE to register a trade in advance. I.E. three months ahead: "I will sell 600 shares on 15th of December if the share price is above 50". This information is public and other people can use this information before the trade actually happens.
Not exactly; it says "we are unaware of any successful reproduction of this vulnerability that would allow unauthorized information disclosure on ARM-based Android devices."
We know that the scariest attack, "Meltdown", cannot be reproduced on AMD or ARM chips at all[1]. The second attack, "Spectre", is also greatly mitigated due to the neural network predicting pathways for the application. Thus it's less likely that you'll be able to access other locations in memory[2]. However, it's definitely possible.
There aren't really any special Android ARM CPUs; maybe they are confident it doesn't really work on Android because it's very difficult to get the timing precision and low-level assembly sequences in Java/ART compiled code. Though I wonder how that squares up with JNI.
I think the key to the statement is in any case that you need to differentiate between what is possible on the processor architecture level when you have full software control, and what is possible on an operating system level, where 3rd party applications are further restricted in various arbitrary ways such as only allowed to use Java, limited access to high resolution timing primitives, etc. that can make practical exploitation impossible, even if the flaw is present.
It's difficult to reason about because it's hard to tell if you can manipulate a JIT runtime into generating the code you need for the exploit to work - and as the JavaScript implementations show, the answer is often "yes".
JIT engines (and compilers) often generate familiar instruction patterns. Many JIT engines target specific languages (like JS) and as a result have "simpler" optimizers (less time to do this) and possibly more stable instruction patterns. So my money is on somebody fuzzing the required JS code.
To be fair, the Intel post alludes to collaborating with AMD/ARM on mitigating Spectre, but userspace memory leaking is wholly separate from kernel memory leaking (Meltdown, which only affects Intel processors).
It's a developing story, but from the information we have so far, it does look like Intel involving AMD is disingenuous since AMD processors are not affected by the most serious of the issues.
It's too early to say which is ultimately the most real-world serious.
From the Spectre note (which does affect AMD):
In addition to violating process isolation boundaries using native code, Spectre attacks can also be used to violate browser sandboxing, by mounting them via portable JavaScript code. We wrote a JavaScript program that successfully reads data from the address space of the browser process running it.
How quickly are we going to see attacks targeting BTC/ETH wallets, apps etc. on clients and cloud hosted exchanges?
(Edit: there are 9 posts total, go to her user page to see them all)
Seems there are two issues. One, called Meltdown, only affects Intel and is REALLY bad, but the kernel page table changes everyone is making fix it.
The other, dubbed Spectre, is apparently common to the way all processors handle speculative execution and is unfixable without new hardware.
I’d like to know more about that but I haven’t seen anything yet.
Whoever discovered this stuff on Google’s team deserves some sort of computer security Nobel prize.