Hacker News
by jacquesm 4659 days ago
This whole discussion about rdrand reminds me of people arguing about what strength the secondary lock on their upstairs back window should be when the downstairs floor has single pane glass windows all around.

Even if rdrand is backdoored, it would have to be a significant supplier of entropy in the resulting random number for this to be a meaningful attack vector. As soon as you mix it with other (good enough, large enough) sources of entropy, some other attack is likely to be far more feasible than exploiting knowledge of the bits that rdrand contributes to the entropy pool.

Such as:

  - good old b&e and placing a keylogger or hardware bug
    (very easy to hide in a keyboard)

  - a compromised bit of the OS

  - compromising the application that you use to encrypt your messages or finding a significant weakness in the application.

  - doing any of the above with the recipient

Did you actually look at how the Linux kernel is mixing RDRAND output with other randomness, or read the comments by the author of the original change.org petition? Because of the way Linux mixes RDRAND output with other entropy using XOR, a malicious RDRAND implementation can easily make the output of /dev/random totally deterministic whilst being completely indistinguishable from a correctly-functioning implementation except to the attacker.

All it has to do is detect the code sequence in question and XOR the output of RDRAND with the randomness from the other entropy sources before returning it. The two XORs cancel out, and this is completely undetectable because there's no way to distinguish between a true random bitstream, a good PRNG, and a good PRNG XORed with data you provided based on the bits themselves.
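The cancellation argument can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `attacker_stream`, `malicious_rdrand` and `xor_mix` are hypothetical names, the LCG stands in for whatever predictable sequence an attacker might choose, and the "hardware" is simply handed the pool word it is about to be XORed with. It says nothing about how a real CPU could obtain that value.

```python
import secrets

MASK64 = (1 << 64) - 1


def attacker_stream(seed: int):
    """Hypothetical sequence the attacker can reproduce (a plain LCG, illustration only)."""
    state = seed & MASK64
    while True:
        state = (state * 6364136223846793005 + 1442695040888963407) & MASK64
        yield state


def xor_mix(pool_word: int, hw_word: int) -> int:
    """Final mixing step as described in the thread: output = b XOR a."""
    return pool_word ^ hw_word


def malicious_rdrand(pool_word: int, stream) -> int:
    """Assumption: the hardware can see the word it will be XORed with.

    It returns (attacker value XOR pool word), so the later XOR cancels
    the pool word and leaves only the attacker value.
    """
    return next(stream) ^ pool_word


def demo(seed: int = 1234, rounds: int = 4):
    victim_side = attacker_stream(seed)    # state inside the hypothetical chip
    attacker_side = attacker_stream(seed)  # attacker's own copy of that state
    outputs, predictions = [], []
    for _ in range(rounds):
        b = secrets.randbits(64)           # genuinely unpredictable pool word
        a = malicious_rdrand(b, victim_side)
        outputs.append(xor_mix(b, a))      # b ^ (k ^ b) == k
        predictions.append(next(attacker_side))
    return outputs, predictions
```

Calling `demo()` returns two equal lists: every "random" output matches the attacker's prediction even though `b` came from a good source. If `a` were independent of `b`, the same XOR would be harmless, which is the point the replies below argue over.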

I keep hearing this argument, but I don't feel like it's relevant to RDRAND. Let's say the numbers are generated by XORing RDRAND as "a" and the other parts as "b", such that for any given call:

/dev/random = a XOR b

If the NSA only knows "a", that's fine, "b" is still pretty random. They can't compromise the randomness of "b" unless they know "b".

Now if they know "b", then we're screwed whether we use RDRAND or not, and safe encryption using Intel chips is just impossible. However I don't think anybody is suggesting that.

There's a difference between the NSA being able to add a malicious circuit into a CPU that has access to "b" and being able to leak the value of "b" to systems they control. Thankfully, in the case of RDRAND they don't have to do the latter - they can just neutralize the effect of "b" on the result on the CPU itself.
> All it has to do is detect the code sequence in question and XOR the output of RDRAND with the randomness from the other entropy sources before returning it.

How is that going to work? i.e. how is RDRAND going to 'detect the code sequence'?

RDRAND wouldn't, the control unit would. Whenever it sees the XOR macroinstruction it checks the second operand to see if it's RDRAND. If so, it doesn't order an XOR; rather it just copies the RDRAND value to the first operand address.

That's the straightforward way of doing it. The 'finesse' would be to leave RDRAND as a secure random source, but in the case of it being used as an operand of XOR, simply to ignore RDRAND entirely, substituting an insecure stream. The advantage, other than reduced risk of detection, would be that asynchronous access to RDRAND wouldn't scramble the otherwise breakable output.

Only Intel engineers know exactly how to do this and I doubt they're allowed to reveal hardware internals, but at the point RDRAND actually executes, the next few instructions should have already been decoded and the data flow between them analyzed. In theory it's not terribly hard to use that information to change the behaviour of RDRAND.
Honestly, and for lack of a more suitable expression, put up or shut up. If you think rdrand actually reads back the output of the RNG from RAM in order to nullify it, then show it.

It's actually possible, you can verify that the timing of the instruction conforms to what it's supposed to be doing, you can check for RAM access. RAM accesses are slow and easy to detect (I'm sure there even are hardware counters for that kind of thing on modern CPUs).

So unless you can get any kind of hard evidence that would cast even the shadow of a doubt on what rdrand is doing: this is pure FUD.

Finding out how rdrand is truly implemented is hard, but if it's truly the evil instruction of doom that sends images from your webcam to the NSA then it should be trivial to prove it's not behaving as it should.

Instead of saying put up or shut up, let's think if this is within the capabilities of Intel or an impossible feat.

First off, the RNG doesn't have to reside in RAM as it could already be in cache. So you're already not going to be detected by looking at RAM access. Also, it's not 1992. Modern architectures and modern operating systems are going to throw out instruction timings from Intel manuals. A cache miss and you're toast.

Now if you have a dedicated pipeline for executing an RNG within a code cache, all you would have to do is work out its inverse. Very plausible.

Unless the above sounds magical, it does seem like this is a possibility. And as it's been shown that the NSA is using its enormous budget to pay US companies to help do its bidding, this does seem like it's within reach.

Aren't the next instructions going to be in the code cache? So "detecting the code sequence" would seem trivial.
Remember, this is a hardware implementation you're talking about. Nothing is ever "trivial". It would take a significant amount of extra silicon to add this kind of detection logic.

The reason for the basic paranoia about not trusting RdRand directly is that it's pretty easy and cheap to make it generate a random number stream that looks random, but is predictable (the RdRand function already is documented to use AES; all you would need to do is make it do AES of an incrementing integer sequence, rather than actual random noise, which is a pretty small change). And heck, if RdRand isn't backdoored (no one has presented evidence that it is; it's just a standard level of paranoia because subverting the random number generator is a favorite technique of the NSA), it might be in a future version, or AMD's or ARM's implementation of a similar instruction in the future may be.
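A minimal sketch of the "encrypt an incrementing counter" idea, with loud caveats: Python's standard library has no AES, so SHA-256 over (key, counter) is swapped in as a stand-in keyed function, and `backdoored_stream`, `take` and the key are hypothetical. It illustrates the shape of the concern, not how any real RdRand works.

```python
import hashlib


def backdoored_stream(secret_key: bytes, start: int = 0):
    """Yield 8-byte blocks derived from a secret key and an incrementing counter.

    Stand-in for "AES of an incrementing integer sequence": SHA-256 is used
    here only because it ships with Python. Anyone holding secret_key and the
    counter position can regenerate every block.
    """
    counter = start
    while True:
        block = hashlib.sha256(secret_key + counter.to_bytes(16, "big")).digest()
        yield block[:8]
        counter += 1


def take(gen, n: int):
    """Collect the next n blocks from a generator."""
    return [next(gen) for _ in range(n)]


# The "chip" and the attacker share the key, so their streams are identical.
chip_output = take(backdoored_stream(b"hypothetical-key"), 5)
attacker_view = take(backdoored_stream(b"hypothetical-key"), 5)
outsider_view = take(backdoored_stream(b"some-other-key"), 5)
```

Here `chip_output == attacker_view`, while someone without the key has no practical way to tell the blocks from noise. That asymmetry is why output-only statistical testing cannot settle the question.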

Detecting a code sequence and subverting it would be far more difficult. For one, there's the extra silicon. There's the extra chance of that change introducing other noticeable behaviors. There's the extra chance of discovery. It's just not worth the costs. And furthermore, if you really are worried about that, then there's no reason to limit your paranoia to the RdRand function; you may as well say you can't trust the chip to run any crypto code at all.

We can already rule out the extra silicon costs. Don't forget that a program like this one would be subsidised.

If you can't trust a chip with one instruction, why trust it with the others? I'm in no disagreement with you here. I was just responding to jgrahamc asking how it was going to work.

> All it has to do is detect the code sequence in question

Extraordinary claims...

> question and XOR the output of RDRAND with the randomness from the other entropy sources before returning it.

How is that easy? No, predicting or detecting that the returned value of your assembly instruction will later be xor'ed by some other value, in all its machine code variants that different versions of gcc will produce, is not easy.

It is theoretically possible if you have access to the CPU design and can modify it, but even then it is very non-trivial, if even doable in the general case.

There are several free CPUs around you can instantiate on an FPGA and boot Linux on - if someone makes a proof-of-concept rdrand() on one of these that can detect the future bit operations on the value (even when it's moved to another register or to/from main memory) and cancel out that bit operation - then I'll believe it's possible.

Until then, I'm more (more compared to not at all is still very little) worried that:

* the chip (whose part number Google knows nothing about) in my DSL modem has a backdoor and is able to mirror all its traffic

* that the baseband chip in my HTC has the same ability - in addition to the known ability of being able to report its GPS location without informing me

* that the NSA probably still can read my gmail mail

* that my Raspberry Pi SoC can contain an unknown component that dumps its memory out the ethernet card

* that the latest iPhone perhaps complies nicely with the 3GPP TS 33.108 spec.

Did you read the article? It explained why an implementation of RDRAND as "XOR together the contents of all registers and return it" would result in removing nearly all of the entropy in the state vector. And it proposed a simple solution: modify the code so that the hardware entropy is mixed in earlier in the process (in which case it WOULD require the prodigious feats you are talking about).
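To make the "mix it in earlier" point concrete, here is a rough Python comparison. It assumes "earlier" means "before the hash/extraction step", which is my reading and not something the article is quoted as saying here; `extract`, `late_xor`, `early_mix` and `make_cancelling_rdrand` are hypothetical names, and SHA-256 merely stands in for the kernel's extraction function.

```python
import hashlib


def extract(pool: bytes) -> int:
    """Stand-in extraction step: hash the pool and keep 64 bits."""
    return int.from_bytes(hashlib.sha256(pool).digest()[:8], "big")


def late_xor(pool: bytes, rdrand) -> int:
    """Hardware value is XORed into the already-extracted output."""
    hashed = extract(pool)
    return hashed ^ rdrand(hashed)      # hardware sees its XOR partner


def early_mix(pool: bytes, rdrand) -> int:
    """Hardware value is appended to the pool before extraction."""
    visible = int.from_bytes(pool[:8], "big")   # some register contents it can see
    hw = rdrand(visible)
    return extract(pool + hw.to_bytes(8, "big"))


def make_cancelling_rdrand(target: int):
    """Malicious model: return (target XOR whatever value is visible)."""
    def rdrand(visible: int) -> int:
        return target ^ visible
    return rdrand


pool = b"interrupt timings and other pool contents"
target = 0x0123456789ABCDEF
evil = make_cancelling_rdrand(target)

forced = late_xor(pool, evil)      # equals target: the XOR cancelled
not_forced = early_mix(pool, evil) # hash output, not chosen by the attacker
```

With the late XOR the attacker picks the output outright. With the early mix the same trick no longer works; forcing a chosen output would mean finding an input whose hash equals `target`, which is the kind of prodigious feat mentioned above.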
get_random_bytes() documentation:

    This function is the exported kernel interface.  It returns some
    number of good random numbers, suitable for key generation, seeding
    TCP sequence numbers, etc.
Here is the accepted commit that makes get_random_bytes() use RDRAND directly:

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.g...

Note that this was version 3 of that patch. Versions 1 and 2 also took control of /dev/urandom. Here is v2:

http://thread.gmane.org/gmane.linux.kernel/1173350/focus=117...

A year later, Ted Ts'o made get_random_bytes() go through the usual entropy pool and added get_random_bytes_arch() for a consumer that doesn't want to go through the entropy pool. (The core kernel does not currently use get_random_bytes_arch() anywhere.)

http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.g...

Note that the heated discussion came before v3 of Anvin's patch, and thus /dev/urandom was included. Matt's objections were perhaps not expressed very clearly, but Linus was pretty cavalier in overruling Matt Mackall (the /dev/random maintainer at that time) and I think his retort to George Spelvin's very rational objection was unreasonable:

http://thread.gmane.org/gmane.linux.kernel/1173350/focus=117...

I find it scary that these commits made it as far as they did. Note also that on the day of the leaks (ironic timing), Ts'o had to shoot down a Red Hat engineer proposing to once again make get_random_bytes() bypass the kernel entropy pool.

https://lkml.org/lkml/2013/9/5/212

As for other locations, the point is to be undetectable while deployed at massive scale. Keyloggers and active backdoors are much higher risk. Great for a targeted investigation, but terrible for untargeted passive surveillance.

https://plus.google.com/u/0/117091380454742934025/posts/XeAp...