by ericseppanen 3231 days ago
The full paper is here: https://www.usenix.org/system/files/conference/woot17/woot17...

I am unconvinced. They have not demonstrated an attack on any real-world SSD; instead they have attacked their own FPGA-based design.

The attack assumes that you can control or predict the physical location of data on an SSD, which is unlikely on a system that is doing other I/O.

But worst of all, the "attack" assumes that if you can target the right physical block/pages in flash, you can somehow hit that location with sufficient read-disturb that the result will decode successfully, AND pass ECC checks, meaning the resulting bad data will be returned to the host system.

I am highly skeptical that this could ever work on a real SSD. The combination of BCH/LDPC error-correction codes and a final checksum should make "random bit flipping" impossible to leverage.

Oh, and there's one more thing: SSD firmware keeps counters, to ensure that read disturb can't corrupt data. Any read pattern that hammers a particular location will trigger garbage collection or data rewrite to a fresh location.
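To make the counter idea concrete, here is a minimal sketch of per-block read counting that triggers a refresh. The threshold, class name, and method names are hypothetical and not taken from any real SSD firmware; actual drives use their own policies.

```python
# Hypothetical sketch: per-block read counters that trigger a refresh.
# READ_LIMIT and all names here are illustrative assumptions,
# not taken from any real SSD firmware.
READ_LIMIT = 1000


class ReadDisturbGuard:
    def __init__(self, read_limit=READ_LIMIT):
        self.read_limit = read_limit
        self.reads = {}      # physical block -> reads since last refresh
        self.refreshed = []  # log of blocks that were rewritten elsewhere

    def on_read(self, block):
        """Count a read; return True if the block was refreshed as a result."""
        self.reads[block] = self.reads.get(block, 0) + 1
        if self.reads[block] >= self.read_limit:
            # Real firmware would copy the data to a fresh block here.
            self.refreshed.append(block)
            self.reads[block] = 0
            return True
        return False


# Hammer one block with a deliberately low limit to see the refresh fire.
guard = ReadDisturbGuard(read_limit=5)
results = [guard.on_read(block=7) for _ in range(12)]
```

With a limit of 5, twelve reads of the same block cause two refreshes, so a hammering pattern keeps resetting its own progress.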


Author here, I would like to set the record straight.

We do not claim to have an attack on SSDs. The journalist seems to have misunderstood and not read the paper. The attack demonstrated is not on an FPGA or SSD.

The main point this paper makes and demonstrates is that if you can cause corruption of a full block (i.e., completely garble contents of a chosen block), then you can elevate privileges (with some assumptions, like using ext3). Note that this result does not depend on whether you are using an SSD, a disk, or any other storage for your filesystem.

Are you claiming that a random-bit-flipping attack such as targeted read disturb can cause corrupted data to be returned even through data scrambling, a first-level LDPC check and a final CRC check on the output?

From your paper: "We assume that the victim system runs a filesystem on top of MLC NAND flash-based SSD."

It seems very naive to be surprised that people would assume this is an attack on SSDs.

The flash weakness is clearly documented as just being part of their threat model, not part of their research. They say that their contribution is in the filesystem part of the attack, building on a weakness proposed by a previous flash-layer-focused paper. So this is completely OK.

If you want to critique the flash paper, or how this paper represents that paper's findings, you should turn your attention to:

Yu Cai, Saugata Ghose, Yixin Luo, Ken Mai, Onur Mutlu, and Erich Haratsch. “Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques”. In: 23rd IEEE International Symposium on High Performance Computer Architecture. 2017.

I found a PDF link too: https://pdfs.semanticscholar.org/b9bc/a3c9f531002854af48de12...

I agree the earlier paper shares the same misconceptions.

I don't agree that the authors of the present paper are exempt from criticism for this reason.

Still in the introduction you write:

Based on a recently published paper by Cai et al. [2] that proposes that rowhammer-like attacks are possible on SSDs but does not present an actual attack, we investigate the feasibility of such attacks on SSDs from the system point of view.

So it might be easy for a non-technical reader to jump to that conclusion.

Academics don't write publications in sloppy-journalist-proof ways, though. And that's fine, they have more important audiences they are writing for.
Not sure if you've ever actually been in academia, but on any type of publication (paper, thesis, etc.) it is very well understood that title, abstract, introduction and conclusion are "for the masses" while the rest is for the interested (and assumed-to-be-"qualified") reader.

However, I agree that we should expect science journalists to be in the latter group.

So I see failures on both sides of the communication.

So how would you have written the quoted bit? It seems pretty masses-friendly to me, any addition of disclaimers or weasel words would just detract from it.

To raise an earlier quote, a central sentence from the abstract: "In this paper, we discuss the requirements for a successful, full-system, local privilege escalation attack on such media, and show a filesystem based attack vector." This is also a good description for the masses, and only a very sloppy journalist would read past that and jump to premature conclusions.

(side note: I don't think there's any need to get into credentials about who's been in academia. There are lots of terrible writers, and minority opinions about writing, in academia.)

The quote conveys what the author does. That the average journalist assumes it means they have carried out an attack is on the journalist's shoulders, because it clearly states they are only looking at it from the filesystem side.
> The main point this paper makes and demonstrates is that if you can cause corruption of a full block (i.e., completely garble contents of a chosen block), then you can elevate privileges (with some assumptions, like using ext3).

That's an entirely unsurprising fact, especially if you've ever played around with cracking/patching. A single-bit change in the right place is sufficient to turn an "are you root/registered/privileged/etc.?" check into its negation. This isn't anything novel or unexpected to anyone who knows how software works.
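As a small illustration of the single-bit point: on x86, the short conditional jumps JE (opcode 0x74) and JNE (opcode 0x75) differ in exactly one bit. The byte sequence below is a made-up fragment chosen for illustration, not code from any real binary.

```python
# x86 short conditional jumps: JE/JZ rel8 is opcode 0x74, JNE/JNZ rel8 is 0x75.
JE_REL8 = 0x74
JNE_REL8 = 0x75


def flip_bit(byte, bit):
    """Return byte with the given bit (0 = least significant) inverted."""
    return byte ^ (1 << bit)


# Hypothetical fragment: "cmp eax, 0" followed by "je +5".
code = bytearray([0x83, 0xF8, 0x00, JE_REL8, 0x05])

# Flipping the lowest bit of the jump opcode inverts the branch condition.
code[3] = flip_bit(code[3], 0)
patched_is_jne = code[3] == JNE_REL8
differing_bits = bin(JE_REL8 ^ JNE_REL8).count("1")
```

This only works when you choose which bit to flip, which is a different situation from corrupting a block without control over the result.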

This is not about having control over the changes (flipping a bit in a file, say), but rather about random corruption.

Also, this is not a journal paper; this is a workshop (Usenix "Workshop on Offensive Technologies"), which is meant as a kind of get-together of academics and practical/industry guys. So just demonstrating a "theoretically obvious" exploit would be fine content for that venue, especially if it's not been academically documented before.

If there's anything we've seen, over and over again, it's that theoretical and infeasible attacks eventually become, in order:

1) possible

2) feasible; and

3) reliable to the point of weaponization

It may take 5 years. It may take 20 years. It will invariably require a huge amount of other research, only some of which will appear relevant. Then all of a sudden the intermediate pieces are all understood and the first practical attack becomes possible.

Even if this attack only works against an ideal target, it still shows a new way of thinking about particular attacks.

> Any read pattern that hammers a particular location will trigger garbage collection or data rewrite to a fresh location.

I can't help thinking that you may have inadvertently outlined how an eventual practical attack will be performed. This wouldn't be the first time a mitigation method is abused to prepare an attack either - what if you had statistical methods at your disposal to predict how the SSD's wear-leveling redirects your writes? Could you arrange for the cells to be rotated in and out in a reliably determinable pattern?

I'm not discounting your doubts, btw. I'm just pointing out that dismissing the attack due to its current sophistication (or lack thereof) feels shortsighted.

> If there's anything we've seen, over and over again, it's that theoretical and infeasible attacks eventually become, in order:

In general, yes, it's always good to keep in mind that just as technology progresses exponentially, technological attacks also progress exponentially.

BUT, theoretical attack -> weaponized attack is hardly an axiom. To take a page out of history which I believe is apropos, let us recall the old myth of recovering data from an erased hard drive.

Way back in yonder years it was widely believed that three letter agencies could take any hard drive that had been erased, and recover all the data by carefully analyzing the residual magnetic flux. A single erase, the theory went, wasn't enough to fully wipe the magnetic signal.

The idea was so pervasive that security obsessed peoples would wipe their drives 6, 7, maybe even 8 times just to be sure. That'll stop those three letter agencies!

Well, as time went on it turned out the theoretical attack became less plausible and less feasible! We have no evidence that such a technique was _ever_ used. And while, in theory, it _may_ have been possible when the myth started, the relentless march of platter density rendered it less and less feasible as time went on.

It's hard to know what attacks will follow the exponential curve upward towards weaponization, and which will follow it downward to obscurity. Best to just keep your wits about you, I say.

I don't think we can assume that all impractical attacks will eventually become feasible.

There are some things that are not just hard, but computationally infeasible. Triggering random bit errors and expecting to pass both the LDPC error correction as well as the extra checksum probably falls in this category.
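A rough way to see the scale of the problem, using CRC-32 from Python's `zlib` as a stand-in for the final checksum only. This is an assumption for illustration: real drives use their own ECC and checksum schemes, and the LDPC stage is not modelled here. The page size and trial count are also arbitrary.

```python
import random
import zlib

PAGE_BYTES = 4096  # assumed page size, for illustration only


def corrupt(page, flips, rng):
    """Return a copy of page with `flips` distinct, randomly chosen bits inverted."""
    out = bytearray(page)
    for pos in rng.sample(range(len(out) * 8), flips):
        out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)


def count_undetected(trials, rng):
    """Count corrupted pages whose CRC-32 still matches the stored checksum."""
    undetected = 0
    for _ in range(trials):
        page = bytes(rng.getrandbits(8) for _ in range(PAGE_BYTES))
        stored_crc = zlib.crc32(page)
        damaged = corrupt(page, rng.randint(1, 64), rng)
        if zlib.crc32(damaged) == stored_crc:
            undetected += 1
    return undetected


rng = random.Random(2017)
undetected = count_undetected(trials=200, rng=rng)

# For a well-behaved 32-bit check, a random corruption slips through
# with probability of roughly 2**-32 per attempt.
miss_probability = 2.0 ** -32
```

Even before any error correction is considered, an attacker who cannot choose the flipped bits would need on the order of billions of attempts per undetected corruption against a 32-bit check alone.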

I'm afraid I don't follow your suggestion that triggering SSD GC could somehow result in some other attack. This is simply the firmware automatically repairing the damage you were attempting to inflict. I don't see an additional attack vector here.

Since flash is already an unreliable medium, hardware & firmware already work very hard to conceal and silently repair any errors before they accumulate into data corruption. This is very different from a rowhammer-type attack because there is an active CPU that already works to prevent this type of damage when it occurs naturally (or due to a naive workload that reads hot locations often).

> I'm afraid I don't follow your suggestion that triggering SSD GC could somehow result in some other attack.

I was thinking more of the wear-leveling of the NAND cells. (Sibling comment from wtallis points out that the entire technology is being phased out so that's pretty much covered then.)

What I had in mind was a write-spray to identifiable locations. Wear-leveling cycles cells out from active to inactive, and from inactive back to active. If you could prepare a whole bunch of cells with suitable patterns, AND had a way to get occasional cells cycled in uninitialised - then having predictable control over "where"[ß] a cell is cycled back in could allow you to target the reads and writes to perform the attack.

We don't need control over which cells are cycled in if the majority of incoming cells already have our data on them from their previous active incarnation.

ß: There is indirection above the physical cells and their addressing. I just don't know how many layers.

That's not how SSDs work. You would never be exposed to uninitialized flash pages; they are unlinked from the logical address space until after the block gets erased and programmed with fresh data. Wear leveling doesn't change that process at all.
> It may take 5 years. It may take 20 years. It will invariably require a huge amount of other research, only some of which will appear relevant. Then all of a sudden the intermediate pieces are all understood and the first practical attack becomes possible.

Except that the NAND flash that's vulnerable to these attacks is being phased out of production as quickly as the fabs can be converted. Coming up with more plausible ways to obtain the oracular knowledge necessary to properly target this attack is of no use if the underlying storage medium no longer has the failure mechanism that's being exploited.

And another thing, you have little or no access to the physical layout of the SSD - you are writing through a complex algorithm that does wear levelling, block remapping and lazy deletions. To write to NAND flash, you need to erase the entire block first.
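A toy sketch of why the host cannot pick physical locations: in a log-structured mapping layer, every write to a logical address lands on whatever free page the firmware chooses, and the old copy is merely unlinked. The class and field names are hypothetical, and block erase, garbage collection, and wear-levelling policy are deliberately left out.

```python
class TinyFTL:
    """Toy flash translation layer: every write lands on the next free page."""

    def __init__(self, num_pages):
        self.free = list(range(num_pages))  # physical pages ready to program
        self.l2p = {}                       # logical address -> physical page
        self.stale = set()                  # old copies awaiting block erase
        self.flash = {}                     # physical page -> data

    def write(self, lba, data):
        page = self.free.pop(0)             # firmware, not the host, picks the page
        if lba in self.l2p:
            self.stale.add(self.l2p[lba])   # lazy deletion: old copy is only unlinked
        self.l2p[lba] = page
        self.flash[page] = data
        return page

    def read(self, lba):
        return self.flash[self.l2p[lba]]


# Writing the same logical address twice uses two different physical pages.
ftl = TinyFTL(num_pages=8)
first = ftl.write(lba=3, data=b"v1")
second = ftl.write(lba=3, data=b"v2")
```

The host only ever sees the logical address 3; which physical page holds the data changes on every write, and the stale page is unreachable through the logical address space.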

Perhaps you could corrupt the DRAM cache of the SSD, though?

It should be noted that while the disks do keep counters, their space for such counters is small and you can sometimes overflow it, so some places are forgotten. I've seen it happen on HDDs, where it was the root cause of repeated failures at some customers due to their specific workloads and their interaction with the system.

I do support your general point that this is rather unlikely to work on SSDs, but it may work on HDDs, as they have a similar failure mode and physical locality is easier to achieve on them.

That said, the unlikelihood doesn't make it impossible. Filesystems should be written and tested against such attacks.