| HN Mirror


	by rayiner 3130 days ago
	Is glorious the right word for it? We’re going back to the stone ages where processors couldn’t predict the targets of indirect jumps. More generally, this seems to me like an attempt to patch out of what is really a class of attacks leveraging fundamental assumptions about high-performance CPU design. Before, OOO just had to preserve correctness and (some of) the order of exceptions and memory operations. Now, it has to preserve (some of) the timing of in-order execution too? Where does this path end?

7 comments

simias 3130 days ago

Legitimate question: on any non-shared non-virtualized system is there any reason to enable these workarounds besides running sandboxed applications such as javascript in a web browser (or flash/java applets/Active X, but those are not really super popular nowadays)?

For any other non-sandboxed application you pretty much have to trust the code anyway. Privilege escalation is always a bad thing of course, but for single user desktop machines getting user shell access as an attacker means that you can do pretty much anything you want.

As far as I can see the only surface of attack for my current machine would be a website running untrusted JS. For all other applications running on my machine if one of them is actually hostile then I'm already screwed.

Frankly I'm more annoyed at the ridiculous over-engineering of the Web than at CPU vendors. Because in 2017 you need to enable a Turing-complete language interpreter in your browser in order to display text and pictures on many (most?) websites.

Gopher should've won.

sempron64 3130 days ago

This unfortunately also affects almost all mobile apps and modern Windows installations, as they all run Javascript-enabled ads. Maybe this might cause Microsoft to reconsider what it allows to run on Windows but I don't see mobile ads going away any time soon.

tzahola 3130 days ago

>javascript-enabled ads

Good opportunity to get rid of them.

comicjk 3130 days ago

Or, as a compromise, no third-party javascript. Google can easily code up the 100 most common javascript ad formats and let advertisers pick from a menu.

tzahola 3130 days ago

Why would you need a Turing-complete language to describe advertisements? Why can't they be static images? Or static HTMLs?

comicjk 3130 days ago

From the advertiser's perspective, it wouldn't be a Turing-complete language, since they would only have access to standardized templates. Such a system would probably have to be implemented in Javascript at the browser level though, unless you could do it all with CSS animations.

coldacid 3130 days ago

I'd rather just be rid of them altogether. No compromise.

rev_null 3129 days ago

Do they really need 100 formats?

1. Video ad that autoplays.

2. Punch the monkey.

3. ???

panarky 3130 days ago

> on any non-shared non-virtualized system is there any reason to enable these workarounds

Does the non-shared non-virtualized system have any encryption keys in memory that you want to protect?

Do you use full-disk encryption or ssh to other machines or use a cryptocurrency wallet?

simias 3130 days ago

If one hostile application running on my machine isn't sandboxed then SSH local keys are pwned anyway. Might as well install a keylogger or just hijack ssh-agent directly. Full disk encryption keys might not be but the app will have access to any mounted and unlocked safe. Ditto for cryptocurrency wallets without a hardware token (and even with a hardware token if the app can get it to sign a bogus transaction).

I don't think this particular vulnerability significantly increases the surface of attack for any non-sandboxed application running on my computer. There are much easier and straightforward ways to get access to anything an attacker with shell access may want that don't involve dumping the kernel VM. So in my situation the only vector of attack I'm worried about is JS running in the browser since I gave up on javascript whitelisting long ago when I realized that most of the web is unusable when you don't allow heaps of untrusted scripts to run all over the place. I don't have time to audit the source code of every random website I visit.

gmueckl 3130 days ago

These questions are only relevant if you're not controlling and trusting all the code you're running on that system. For a consumer system this is true if (and basically only if) you're running a web browser on that system.

If you're confident in the software you're running on non-shared hardware, both Meltdown and Spectre are non-issues requiring no mitigation. This is a narrow class of systems, but it exists.

acdha 3130 days ago

> For a consumer system this is true if (and basically only if) you're running a web browser on that system.

… which is pretty close to universally true, especially when you consider how many people use apps which are based on something like Electron. If those apps load code or, especially, display ads there's JavaScript running in places people aren't used to thinking about.

sigstoat 3130 days ago

> If you're confident in the software you're running on non-shared hardware...

and that you won't be hit by a remote execution vulnerability.

gmueckl 3130 days ago

That's what being confident in the software means.

JdeBP 3129 days ago

As the owner of gopher://jdebp.info/1/ and the author of gopher://jdebp.info/h/Softwares/djbwares/guide/gopherd.html , I disagree. GOPHER needs a lot of improvement merely to learn the lessons that people learned with FidoNet in the 1980s.

bsdetector 3130 days ago

Followup legitimate question: the only way to read data is to control the results of a speculative execution or fetch right?

For JavaScript, won't it be sufficient to check all the calls out of it so that they can't pass data that controls an exploitable speculative execution, and also generate JIT code so the JS itself can't create exploitable instructions? The API will have to be heavily scrutinized and the JS will run somewhat slower.

If the rest of the browser code is vulnerable, but the JS code can't control the speculative execution then it should be safe to run any JS.

sempron64 3130 days ago

Javascript is theoretically fixable -- what's needed is less fine-grained timing capabilities. I think it would be very hard to completely eliminate the possibility of controlling speculative execution as long as you can predictably invalidate the CPU branch predictor, which can be done even at a high level with if statements. Unless you get rid of the notion of contiguous arrays (making predictable word-aligned cache manipulation nearly impossible) and potentially remove the JIT completely in favor of an interpreter, it's hard to be completely rid of this class of attack when executing any sort of code on the processor. Those steps are probably possible, but the JS performance hit that ensues would be the death knell of any browser.

That said, without some way of extracting timing at the granularity of 10s of instructions, this attack is moot. So that's likely going to be the mitigation. Unfortunately, the web frames used in some apps are infrequently if ever updated, so JS engine updates there are gonna be hard.
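A minimal sketch of the timer-coarsening idea described above, assuming timestamps in nanoseconds. The helper names `coarsen_timestamp` and `coarse_elapsed` and the 100 µs granularity in the usage note are hypothetical, chosen only for illustration; they are not any browser's actual API.

```c
#include <stdint.h>

/* Hypothetical helper: round a nanosecond timestamp down to a multiple of
 * granularity_ns, so callers cannot observe differences smaller than that. */
static uint64_t coarsen_timestamp(uint64_t t_ns, uint64_t granularity_ns)
{
    if (granularity_ns == 0) {
        return t_ns; /* no coarsening requested */
    }
    return t_ns - (t_ns % granularity_ns);
}

/* Elapsed time as seen through the coarsened clock. */
static uint64_t coarse_elapsed(uint64_t start_ns, uint64_t end_ns,
                               uint64_t granularity_ns)
{
    return coarsen_timestamp(end_ns, granularity_ns)
         - coarsen_timestamp(start_ns, granularity_ns);
}
```

With a granularity of 100000 ns, two events 200 ns apart (roughly the scale of a cache hit versus a miss) usually report an elapsed time of 0. This only raises the cost of the attack: a script can still average many samples or build its own counter, so coarsening alone should not be read as a complete fix.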

bsdetector 3130 days ago

Why does it matter if they can cause the CPU's predictors to guess incorrectly if they can't control the target address of the branch or memory access?

Example: "if (a < length) return data[a]". If "a" comes directly from JavaScript then they can trick the CPU into fetching data[a] even if it's invalid speculation and thrown out. But if there's a safe barrier in between, as in "if (a < length) { prevent_speculative_execution; return data[a]; }", then they cannot learn anything.

I concede that safely checking all data coming from JS code to the browser would be a huge task, but pretty sure it would work to fix the problem for JavaScript although not in general, between processes with shared IO pages and such.
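One portable way to illustrate the "make the out-of-bounds access harmless" idea is index masking rather than the speculation barrier proposed above; this is a swapped-in technique, not what the comment describes. The sketch below assumes `length >= 1`, and the names `index_mask` and `read_clamped` are hypothetical. A compiler is free to turn the comparison back into a branch, which would defeat the purpose, so production code (for example the Linux kernel's `array_index_nospec`) relies on carefully written arithmetic or inline assembly instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: all-ones when idx < size, zero otherwise. */
static size_t index_mask(size_t idx, size_t size)
{
    return (size_t)0 - (size_t)(idx < size);
}

/* Read data[a] without a bounds-check branch. Assumes length >= 1.
 * An out-of-range index collapses to 0, so even a speculative load
 * stays inside the array, and the returned value is forced to 0. */
static uint8_t read_clamped(const uint8_t *data, size_t length, size_t a)
{
    size_t mask = index_mask(a, length);
    return (uint8_t)(data[a & mask] & (uint8_t)mask);
}
```

For a four-element array, `read_clamped(data, 4, 2)` returns `data[2]`, while `read_clamped(data, 4, 9)` touches only `data[0]` and returns 0.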

fyi1183 3129 days ago

Honestly, you probably don't even need the barrier in your example. Getting data[a] into the cache is no information leak if the attacker already knows a. That's why the example in the Spectre paper uses an additional level of indirection.
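The extra level of indirection can be sketched as follows. This shows only the shape of the bounds-check-bypass gadget, under assumptions: the names `array1`, `array2`, `victim` and the 4096-byte `PROBE_STRIDE` are illustrative choices, and the code contains no branch-predictor training or cache timing, so it is not an exploit.

```c
#include <stddef.h>
#include <stdint.h>

#define PROBE_STRIDE 4096 /* assumed spacing: one page per possible byte value */

static uint8_t array1[16] = {1, 2, 3, 4};
static uint8_t array2[256 * PROBE_STRIDE];
static size_t array1_size = 16;

/* Architecturally this never reads array1 out of bounds. The concern is the
 * mis-speculated path: if the CPU predicts the branch as taken for a large x,
 * array1[x] is an attacker-chosen secret byte, and the dependent load from
 * array2 leaves a cache footprint whose position encodes that byte. */
static uint8_t victim(size_t x)
{
    if (x < array1_size) {
        return array2[(size_t)array1[x] * PROBE_STRIDE];
    }
    return 0;
}
```

Caching `array1[x]` alone tells the attacker nothing new, since they chose `x`. The second load is what turns the secret value into an address, and therefore into something a cache-timing probe over `array2` could later recover.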

bsdetector 3129 days ago

Yes thanks to HN being so quick to freeze comments I was unable to fix the example.

Point is, a JavaScript program in isolation cannot read anything, it has to interact with the other target code somehow. If that interaction (the data passed over the API call) can't fail after a certain point and can't be used to read data before that point, then the JS can't read anything.

catnaroek 3130 days ago

> Where does this path end?

It ends with the performance advantages of OOO execution being effectively negated by the workarounds to address the security issues it causes.

The following parable is edifying: https://www.cs.utexas.edu/users/EWD/transcriptions/EWD05xx/E...

tcoppi 3130 days ago

Seems like the ultimate end-game here is to have mini-vms for every process using CPU-level ring protection. If you can't speculate across privilege levels, only inside them, it isn't a security problem anymore.

sandworm101 3130 days ago

Or time to have Kernel live on dedicated cache not ever accessed/shared with anything else. Let the CPU speculate all it wants, just not when playing in the kernel's cache. It may even be time for dedicated kernel cpus/cores.

ufo 3130 days ago

Reading Kernel memory (Meltdown attack) is extra bad but regular user processes being able to read each other's memory (Spectre attack) is also very bad and not solvable by isolating the kernel.

sandworm101 3130 days ago

I'm less worried about my steam client reading my chat cache than something inside my web browser reading the keys that encrypt my home directory. Short of abandoning all sharing, the least we can do is isolate kernel cache.

bluGill 3130 days ago

That depends on what you are chatting with. My chatlog would be very interesting to our competitors. The key that encrypts my home directory isn't useful because the firewall blocks your access to my home directory (that's a different layer of security).

jacksmith21006 3130 days ago

Which is why one solution is to put secrets in the kernel and use the Meltdown mitigation to protect them.

FractalNerve 3130 days ago

> It may even be time for dedicated kernel cpus/cores.

Oh yes, I agree! One needs to be able to physically (un)lock the "kernel fpga" like a door without remote capabilities, except for server cpu's. Or whatever chip designers believe is a good "physical kernel embodiment" other than fpga.

EDIT: I know it's not really clever, but I would really enjoy hearing any solutions that don't try to fix it at the hardware level.

manol74 3130 days ago

Qubes OS [1] does something like that

[1]: https://www.qubes-os.org/

strongholdmedia 3129 days ago

Yet Meltdown nuked exactly that.

frik 3130 days ago

> mini VMs for every process using CPU ring protection

Yes. We should really start to learn from history, MULTICS operating system had already 16 CPU ring support back in the early 1970s. MULTICS is the mother of UNIX, its smaller child. MULTICS had so many advanced features that barely got implemented (often reinvented) in newer OS. It's time to read old docs and ask the old devs who are still alive. (Another such often overlooked gem is Plan9, but it's better known thanks to Go lang devs).

Older Intel CPUs only supported 2 rings. Modern Intel CPU supports only 4 rings. Windows and Linux use ring 0 for kernel mode and ring 3 for user mode. And Intel introduced a ring -1 for VT.

  "To assist virtualization, VT and Pacifica insert a new
  privilege level beneath Ring 0. Both add nine new machine
  code instructions that only work at "Ring -1," intended to
  be used by the hypervisor"

It's time for modern operating systems to use more rings, and modern CPUs to correctly protect between different rings.

https://en.wikipedia.org/wiki/Multics

https://en.wikipedia.org/wiki/Protection_ring

tptacek 3130 days ago

We have different utility functions, you and I.

bouk 3130 days ago

tptacek exploits computers for a living, so it's glorious for him :)

tptacek 3130 days ago

It's not that it makes the practice of breaking into computers that much more interesting so much as it makes the underlying field much more interesting to work in. The engineering problems just got a lot more complex. We're all taking an attack vector seriously --- microarchitectural side channels --- that we weren't taking as seriously before, except as an abstract threat to crypto and a way of defeating a mitigation --- KASLR --- that nobody believed in anyways.

What's glorious is that serious software security people now have to start being literate about what it means to reverse engineer and dump the branch history buffers on different CPUs. Getting dragged through this kind of minutiae is the reason I'm still in this field after 22 years.

And I'm just a bystander here. Imagine what it must have been like for Jann Horn over the last several months!

> This subsection describes how we reverse-engineered the internals of the Haswell branch predictor. Some of this is written down from memory, since we didn't keep a detailed record of what we were doing.

... because shit was so crazy while they were working this out that they didn't have the cycles to write everything down!

puzzle 3130 days ago

I'd be surprised if other Googlers like Christian Ludloff (also of sandpile.org fame) and Dean G were not involved. They know x86 better than many engineers at Intel/AMD, having been in charge of the most performance/cost critical code (e.g. tuning search serving down to the last cycle), as well as qualifying new platforms and identifying a steady stream of CPU bugs.

ufo 3130 days ago

As someone specializing in a different field, this comment reads to me as if you were an astronomer being excited about discovering a gigantic meteor heading towards the Earth :)

tptacek 3130 days ago

Sure, if your subfield of astronomy was all about practical ways to adapt to living on meteor fragments!

QAPereo 3130 days ago

Some of us would be terrified, but also excited about the prospect of studying a supermassive black hole from the inside. You know what I mean? Some things are just so singular and incredible, and so to the heart of a given field of interest, that “glorious” is a perfect word to describe it. An alien invasion of Earth would be a nightmare, but for some it would be the moment they could finally move past the realm of the hypothetical.

Passion is passion, even when it’s terminal.

FractalNerve 3130 days ago

> they didn't have the cycles to write everything down!

hahaha that was a good pun! Do you have a link to Jann Horn's personal blog or Github? I've never heard of him before.

panarky 3130 days ago

https://github.com/thejh

https://thejh.net/

https://twitter.com/tehjh

tlrobinson 3130 days ago

Isn't that a bit like a firefighter saying your house burning down is "glorious"?

JshWright 3130 days ago

How many firefighters do you know? I guarantee you every one of them gets excited at the prospect of a "good" structure fire.

jacksmith21006 3130 days ago

Maybe more like a meteorologist excited by a really bad storm.

amaranth 3130 days ago

The result might not be glorious but the fire is amazing to watch.

gpm 3130 days ago

To the extent that he exploits computers for a living (e.g. pentesting) rather than stopping exploits for a living, it seems more like a (presumably law-abiding) arsonist calling houses burning glorious.

tptacek 3130 days ago

Arsonists set fires. Vendors create vulnerabilities, not pentesters.

phyzome 3130 days ago

Ooooh, I dunno, I might argue that vendors have their role in creating pentesters. ;-)

wglb 3130 days ago

I like to think of a lot of vulnerability discovery and research as solving a puzzle. In the sense that this puzzle has so many far reaching implications makes it totally compelling to me. tqbf says "glorious", and I couldn't disagree.

[Edit] Or, how far down does the rabbit hole go?

Additionally, it is quite fascinating to me to compare the complexity of modern CPUs with, say, a compiler.

wyager 3130 days ago

> leveraging fundamental assumptions about high-performance CPU design.

I believe the generalized fix is to restore the entire CPU state after a mispredict. You’d either need to add an extra copy of the entire processor state (tens of megabits) for every simultaneous predict you support ($$$) or keep track of how to revert all changes and revert them one at a time ($, slow).

paulmd 3130 days ago

This is harder than it seems, because once cache is deleted you can't just un-delete it, you'd have to go back to memory and pull it again.

Only the "extra copy of processor state" thing is really viable. You have to have a speculative cache and buffer in reads that only get flushed to the main cache once they're confirmed to be valid, which is enormously complicated. This facility already exists for writes, but now it needs to exist for reads too.
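As a rough illustration of the "speculative buffer for reads" idea, here is a toy software model, not a description of any real microarchitecture. All names (`toy_cache`, `spec_load`, `spec_commit`, `spec_squash`) and the sizes are hypothetical, and the model ignores eviction, coherence, and timing, which is where the real complexity lies.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINES 64 /* assumed size of the visible cache */
#define SPEC_SLOTS 8   /* assumed size of the speculative buffer */

struct toy_cache {
    bool present[CACHE_LINES];  /* lines visible to later (timed) accesses */
    size_t pending[SPEC_SLOTS]; /* lines fetched under speculation */
    size_t pending_count;
};

/* Record a speculative load without touching the visible cache.
 * Returns false for an invalid line or a full buffer (a real core
 * would presumably stall in the latter case). */
static bool spec_load(struct toy_cache *c, size_t line)
{
    if (line >= CACHE_LINES || c->pending_count == SPEC_SLOTS) {
        return false;
    }
    c->pending[c->pending_count++] = line;
    return true;
}

/* Speculation confirmed: publish the buffered lines to the visible cache. */
static void spec_commit(struct toy_cache *c)
{
    for (size_t i = 0; i < c->pending_count; i++) {
        c->present[c->pending[i]] = true;
    }
    c->pending_count = 0;
}

/* Mispredict: drop the buffered lines so no footprint remains. */
static void spec_squash(struct toy_cache *c)
{
    c->pending_count = 0;
}
```

In this model a squashed load leaves `present[]` untouched, so a later timing probe of the visible cache learns nothing about which line was speculatively fetched. Because nothing is evicted until commit, the model also sidesteps the "can't un-delete cache" problem raised above.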

GP is absolutely correct that this is a fundamental assault on processor design as we know it, the speculative execution concept is going back to the drawing board for a major re-think.

leoc 3130 days ago

I can’t help wondering what igodard’s day is like so far...

paulmd 3128 days ago

"I'm walking on sunshine..."

FractalNerve 3130 days ago

Wasn't Intel's transactional CPU Memory the solution, but it also failed due to bugs?

Sorry for quoting wikipedia, but I'm not at school, hah! [1]

'''' TSX provides two software interfaces for designating code regions for transactional execution. Hardware Lock Elision (HLE) is an instruction prefix-based interface designed to be backward compatible with processors without TSX support. Restricted Transactional Memory (RTM) is a new instruction set interface that provides greater flexibility for programmers.[13]

TSX enables optimistic execution of transactional code regions. The hardware monitors multiple threads for conflicting memory accesses, while aborting and rolling back transactions that cannot be successfully completed. Mechanisms are provided for software to detect and handle failed transactions.[13]

In other words, lock elision through transactional execution uses memory transactions as a fast path where possible, while the slow (fallback) path is still a normal lock. ''''

[1] https://en.wikipedia.org/wiki/Transactional_Synchronization_...
