OpenBSD bug in the random() function

	OpenBSD bug in the random() function (banu.com)
	31 points by muks 5211 days ago

4 comments

Yrlec 5211 days ago

This isn't exactly a bug, but if you run this code in Java:

  for (int i = 0; i < 100; i++) {
      Random random = new Random(i);
      System.out.println(random.nextDouble());
  }

It prints the following sequence (at least on JDK 7 and Win 7):

  0.730967787376657
  0.7308781907032909
  0.7311469360199058
  0.731057369148862
  0.7306094602878371
  0.730519863614471
  0.7307886238322471
  0.7306990420600421
  0.7302511331990172
  0.7301615514268123

I know that you're not supposed to recreate the Random instance like that, but it's still a bit odd that the initial values of the sequences are so similar to each other.

joesb 5211 days ago

When you create a Random instance and pass in a long value (i), that value is used as the seed, and the seed is then used to generate the random numbers.

Your seed values from 0 to 99 vary only in the last 7 bits out of 64. I assume that causes whatever pseudo-random function Java is using to generate very similar values for seeds with so little entropy. You can look up the formula in the Java SDK and do the math.

There's no need to pass a seed value to the Random constructor unless you really want to reproduce the same random sequence.

Yrlec 5211 days ago

Any reason for the down-vote? I honestly want to learn.

gcp 5211 days ago

The Java Random class uses a 48-bit LCG with a 35-bit multiplier. Because of this, small seed values won't be able to "wrap around" the full range of the LCG and will produce starting sequences that are anything but random relative to each other.

Put differently, you're seeing that 35/48 = 0.73.

I'd consider this a bug in Java, but it's a common one. Qt has the same problem. Could have been avoided by cycling the seed through the LCG once, instead of using XOR.
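To make the effect concrete, here is a minimal sketch that re-implements the algorithm the java.util.Random documentation specifies (XOR the seed with the multiplier, then step a 48-bit LCG) and compares it with a hypothetical variant that cycles the raw seed through the LCG once instead of XORing. The class and method names are made up for illustration, and the "cycled" variant is only my reading of the suggestion above, not anything the JDK does.

```java
import java.util.Random;
import java.util.function.LongToDoubleFunction;

class SeedScrambleDemo {
    // Constants from the java.util.Random documentation.
    static final long MULTIPLIER = 0x5DEECE66DL;
    static final long ADDEND = 0xBL;
    static final long MASK = (1L << 48) - 1;

    // One LCG step on the 48-bit state.
    static long step(long state) {
        return (state * MULTIPLIER + ADDEND) & MASK;
    }

    // Build a double the way nextDouble() is documented to:
    // top 26 bits of one state, then top 27 bits of the next.
    static double doubleAfter(long state) {
        long s1 = step(state);
        long s2 = step(s1);
        return (((s1 >>> 22) << 27) + (s2 >>> 21)) * 0x1.0p-53;
    }

    // Documented seeding: XOR the seed with the multiplier.
    static double firstDoubleXor(long seed) {
        return doubleAfter((seed ^ MULTIPLIER) & MASK);
    }

    // Hypothetical alternative: run the raw seed through the LCG once instead.
    static double firstDoubleCycled(long seed) {
        return doubleAfter(step(seed & MASK));
    }

    // Range (max - min) of the first double over seeds 0..99.
    static double spread(LongToDoubleFunction firstDouble) {
        double min = 1.0;
        double max = 0.0;
        for (long seed = 0; seed < 100; seed++) {
            double d = firstDouble.applyAsDouble(seed);
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        return max - min;
    }

    public static void main(String[] args) {
        for (long seed = 0; seed < 5; seed++) {
            System.out.printf("seed %d: Random=%.6f xor=%.6f cycled=%.6f%n",
                    seed, new Random(seed).nextDouble(),
                    firstDoubleXor(seed), firstDoubleCycled(seed));
        }
        System.out.printf("spread over seeds 0..99: xor=%.4f cycled=%.4f%n",
                spread(SeedScrambleDemo::firstDoubleXor),
                spread(SeedScrambleDemo::firstDoubleCycled));
    }
}
```

With XOR seeding, a small seed changes the scrambled state by less than 128, so one multiplication by a 35-bit number cannot move the top bits of the 48-bit state very far. In the cycled variant the seed is multiplied twice before the first output, which is enough to wrap around the modulus.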

Yrlec 5211 days ago

Interesting, thanks! Any particular reason they limit the multiplier to 35 bits and the output to 48 bits?

Edit: just noticed that Java limits the output to 32 bits, not 48 (http://en.wikipedia.org/wiki/Linear_congruential_generator). How does it create 64-bit values, like long and double?

gcp 5211 days ago

"Any particular reason they limit the multiplier to 35 bits and the output to 48 bits?"

Good question. There appears to be no good justification for this, but the generator is guaranteed by the docs. So it's possible the initial implementation was bad and everybody has been required to follow it ever since.

marshray 5211 days ago

Probably it generates 32 bits twice.
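For what it's worth, that matches the algorithm given in the java.util.Random documentation: nextLong() concatenates two 32-bit draws, and nextDouble() combines a 26-bit and a 27-bit draw. A small sketch that checks this by subclassing Random to reach the protected next(int) method (the subclass and its method names are hypothetical, only there for the demonstration):

```java
import java.util.Random;

class TwoDrawsRandom extends Random {
    private static final long serialVersionUID = 1L;

    TwoDrawsRandom(long seed) {
        super(seed);
    }

    // Two 32-bit draws glued together, as documented for nextLong().
    long longFromTwoDraws() {
        return ((long) next(32) << 32) + next(32);
    }

    // A 26-bit and a 27-bit draw form the 53-bit mantissa, as documented for nextDouble().
    double doubleFromTwoDraws() {
        return (((long) next(26) << 27) + next(27)) * 0x1.0p-53;
    }

    public static void main(String[] args) {
        long seed = 42L;
        System.out.println(new TwoDrawsRandom(seed).longFromTwoDraws()
                == new Random(seed).nextLong());
        System.out.println(new TwoDrawsRandom(seed).doubleFromTwoDraws()
                == new Random(seed).nextDouble());
    }
}
```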

tedunangst 5211 days ago

I know nothing about Java's random number generator, but your seeds are also very similar to each other.

marshray 5211 days ago

For any decent RNG, even a lightweight one, that shouldn't matter.

ajb 5211 days ago

Indeed. When Knuth was informed of this problem in a random number generator of his, he fixed it: http://news.ycombinator.com/item?id=3730348

There is an obvious use case for this: you have a test which is run N times where you want to have different random numbers in each run, but you also want to be able to go to run X and debug it without running all the previous ones.
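One way to get that with java.util.Random is to hash the run index before using it as a seed, so adjacent runs do not start from nearly identical states. This is only a sketch: the mixing function is the SplitMix64 finalizer as I remember it, and PerRunSeeds, mix and rngForRun are hypothetical names.

```java
import java.util.Random;

class PerRunSeeds {
    // SplitMix64-style mixer: spreads a small change in the input across all 64 bits.
    static long mix(long z) {
        z += 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // Independent-looking generator for run number `run`, reproducible from (baseSeed, run).
    static Random rngForRun(long baseSeed, int run) {
        return new Random(mix(mix(baseSeed) + run));
    }

    public static void main(String[] args) {
        long baseSeed = 2012L; // arbitrary, fixed per test suite
        for (int run = 0; run < 5; run++) {
            System.out.println("run " + run + ": " + rngForRun(baseSeed, run).nextDouble());
        }
        // Jump straight to run 3 and get the same numbers as above.
        System.out.println("run 3 again: " + rngForRun(baseSeed, 3).nextDouble());
    }
}
```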

tedunangst 5211 days ago

Did you intend to link to this page? I'm already here.

ajb 5211 days ago

Crap, no, and it's too late to edit. I intended to link to: http://www-cs-faculty.stanford.edu/~uno/news02.html#rng

Thanks.

gospelwut 5210 days ago

Same sort of thing happens with C#/.NET as well.

calloc 5211 days ago

Or maybe this is why using random() is a terrible idea. Use arc4random() instead on FreeBSD/OpenBSD/Mac OS X for MUCH better random number generation, and best of all it is auto-seeded.

Obligatory XKCD: http://xkcd.com/221/

DHowett 5211 days ago

The only caveat is that you then have to wrap it in an #ifdef if you want source portability. The thing random() has over arc4random() is exactly that - it's part of the standard C library on most platforms.

For discussion: Why is random() not already arc4random() on platforms that provide the arc4 variant? Is it for speed's sake? Different implementations of libc functions will seed differently, so it's not a cross-platform seed stability concern. Is the problem that you can't seed it with a fixed value and get the same pseudorandom sequence?

caladri 5211 days ago

Yes, because of the need for pseudorandom sequences. In FreeBSD this comes up every so often, but the reality is that there's a lot of AI and simulation/modeling code that uses the libc random functions (either rand(3) or random(3)) and expects reproducible behavior with the same seed both throughout the life of a program and across multiple executions.

harshreality 5211 days ago

That could easily be accommodated by implementing random_ng() to take an optional buffer that the PRNG would use to initialize its state. If a buffer is not passed, use a random or pseudorandom entropy source... whatever's available on the system. From Ivy Bridge on, Intel CPUs will have the RdRand instruction, or there's /dev/urandom.

That offers the best of both worlds. If you want repeatability, initialize random_ng() with a known buffer. If you want reasonable unpredictability, let the PRNG initialize itself using whatever it wants. (Not to confuse that PRNG with good entropy randomness that might be accessible from RdRand, or which is usually obtained by asking the user to move the mouse.)
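A rough Java analogue of that interface, as a sketch only: random_ng() does not exist anywhere, and GeneratorFactory and newGenerator are hypothetical names. It assumes java.security.SecureRandom is an acceptable stand-in for "whatever's available on the system", and it uses a single long where the proposal talks about a buffer.

```java
import java.security.SecureRandom;
import java.util.OptionalLong;
import java.util.Random;

class GeneratorFactory {
    // With a seed: repeatable sequence. Without one: seed drawn from the platform's entropy source.
    static Random newGenerator(OptionalLong seed) {
        long actualSeed = seed.isPresent()
                ? seed.getAsLong()
                : new SecureRandom().nextLong();
        return new Random(actualSeed);
    }

    public static void main(String[] args) {
        // Repeatable: both generators print the same value.
        System.out.println(newGenerator(OptionalLong.of(1234L)).nextInt());
        System.out.println(newGenerator(OptionalLong.of(1234L)).nextInt());
        // Self-seeded: differs from run to run.
        System.out.println(newGenerator(OptionalLong.empty()).nextInt());
    }
}
```

The output of the self-seeded generator is still only as good as the underlying PRNG; the entropy source affects where the sequence starts, not its statistical or cryptographic quality.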

caladri 5211 days ago

Right, and there are other RNG and PRNG sources and interfaces for precisely that reason. The question was why random(3) isn't arc4random(3).

tptacek 5211 days ago

You use random() when you need the statistical appearance of random numbers, and potentially the ability to generate the same sequence deterministically. It's not intended for the same use case as arc4random() (which is itself probably not one of the best CSPRNGs).

dfc 5211 days ago

What CSPRNG would you recommend?

tptacek 5211 days ago

Couple things:

(i) arc4random is among the older of the widespread CSPRNGs (WinAPI CryptGenRandom is of the same vintage but has been updated).

(ii) arc4random is an implementation of RC4, which is not a well-regarded stream cipher particularly with regard to biases.

(iii) As I recall (note: I could be totally wrong here) arc4random depends on RC4 as its entropy management function; modern designs tend to use secure hash functions for this.

(iv) arc4random in isolation implements only one component of what Ferguson and Schneier would call a cryptographically secure random number generator (the "generator"); it doesn't handle entropy gathering, it doesn't handle heterogeneous entropy sources, it doesn't manage entropy pools, and it doesn't build in functionality for handling cold starts (i.e., it doesn't inherently persist its state).

One issue that encapsulates all of these (and also blunts any criticism you might perceive of OpenBSD in this comment) is that arc4random as implemented in OpenBSD is not the same thing as openbsd-lib/arc4random.c; OBSD handles entropy gathering, for instance, in the kernel. On the other hand, FreeBSD ignored some of these issues back in 2008 and had a cold start problem. Point being: if you just grab arc4random.c from OpenBSD CVS and stick it in your project, you probably hurt your security.

I like Thomas Pornin's suggestion for this on Stack Overflow: take AES (you probably have it in hardware on modern Intel chipsets) and run it in CTR mode, keyed with entropy drawn from random sources (expanded with a fast secure hash), rekeying regularly; this has the advantage of keeping the RNG state in some sense away from attackers as well.
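A very rough Java sketch of the AES-CTR idea, to show the shape rather than to be used: it takes its key and counter block straight from java.security.SecureRandom (an assumption standing in for real entropy gathering), skips the hash-expansion step, and picks an arbitrary rekey interval. AesCtrGenerator and its members are hypothetical names, and nothing here has been reviewed as a CSPRNG.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class AesCtrGenerator {
    // Hypothetical interval; the comment above only says "rekeying regularly".
    private static final long REKEY_AFTER_BYTES = 1L << 20;

    private final SecureRandom entropySource = new SecureRandom();
    private Cipher keystream;
    private long bytesSinceRekey;

    AesCtrGenerator() {
        rekey();
    }

    // Draw a fresh 128-bit key and initial counter block, then restart the keystream.
    private void rekey() {
        byte[] key = new byte[16];
        byte[] counterBlock = new byte[16];
        entropySource.nextBytes(key);
        entropySource.nextBytes(counterBlock);
        try {
            keystream = Cipher.getInstance("AES/CTR/NoPadding");
            keystream.init(Cipher.ENCRYPT_MODE,
                    new SecretKeySpec(key, "AES"),
                    new IvParameterSpec(counterBlock));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES/CTR not available", e);
        }
        bytesSinceRekey = 0;
    }

    // Encrypting zeros in CTR mode yields the raw keystream, which is the generator output.
    byte[] nextBytes(int count) {
        if (count <= 0) {
            return new byte[0];
        }
        if (bytesSinceRekey + count > REKEY_AFTER_BYTES) {
            rekey();
        }
        bytesSinceRekey += count;
        return keystream.update(new byte[count]);
    }

    public static void main(String[] args) {
        AesCtrGenerator generator = new AesCtrGenerator();
        byte[] sample = generator.nextBytes(16);
        StringBuilder hex = new StringBuilder();
        for (byte b : sample) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex);
    }
}
```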

But of course what you really ought to do is just use /dev/random.

dfc 5211 days ago

Thanks tptacek. I'm not sure you are allowed to be wrong about questions like this, so next time make sure you are prepared ;)

For posterity's sake, if anyone is interested in Pornin's suggestion at SO:

http://stackoverflow.com/a/3532136/915268

saalweachter 5211 days ago

Obligatory Dilbert: http://dilbert.com/strips/comic/2001-10-25/

michaelni 5210 days ago

One issue with the OpenBSD "bug" that I think hasn't been mentioned: while OpenBSD's srandom(0) leading to an all-zero sequence sucks, the fix everyone is using (including up-to-date OpenBSD trunk) causes srandom(X) and srandom(Y) to produce the same sequence for at least one pair of distinct X and Y. This is probably less of an issue, but still. For example, Debian Linux with GNU libc produces the same sequence after srandom(0) and srandom(1), namely 1804289383 846930886 1681692777 1714636915 ...
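The trade-off can be shown with a toy multiplicative generator. This is an illustration of the general problem, not the OpenBSD or glibc code: it assumes a plain Lehmer generator (multiplier 16807, modulus 2^31 - 1), for which state 0 is a fixed point, and ZeroSeedDemo, sequence and patchSeed are hypothetical names.

```java
import java.util.Arrays;

class ZeroSeedDemo {
    static final long MODULUS = 2147483647L; // 2^31 - 1
    static final long MULTIPLIER = 16807L;

    // One step of the Lehmer generator; 0 maps to 0 forever.
    static long next(long state) {
        return (state * MULTIPLIER) % MODULUS;
    }

    // First n outputs for a non-negative seed, with no special-casing of 0.
    static long[] sequence(long seed, int n) {
        long[] out = new long[n];
        long state = seed % MODULUS;
        for (int i = 0; i < n; i++) {
            state = next(state);
            out[i] = state;
        }
        return out;
    }

    // The common patch: silently replace a zero seed with 1.
    static long patchSeed(long seed) {
        return seed == 0 ? 1 : seed;
    }

    public static void main(String[] args) {
        System.out.println("raw seed 0:     " + Arrays.toString(sequence(0, 4)));
        System.out.println("patched seed 0: " + Arrays.toString(sequence(patchSeed(0), 4)));
        System.out.println("seed 1:         " + Arrays.toString(sequence(patchSeed(1), 4)));
    }
}
```

The patch removes the all-zero sequence, but only by making seeds 0 and 1 indistinguishable, which is the collision described above.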

dfc 5211 days ago

It seems like the right thing to do would be to spend the time composing the email to tech@o.o and then write the blog post.

marshray 5211 days ago

I've tried pointing out a deficiency in the system RNG to those guys before.

They're not as grateful as you'd think.

gonzo 5211 days ago

OpenBSD is full of navel-gazing.
