Hacker News new | ask | show | jobs
by orlp 1532 days ago
I don't know about a better book, but I do know how to generate a random float in [0, 1] very efficiently and accurately. A 2007 paper describing the technique is Generating Pseudo-random Floating-Point Value by Allen B. Downey. I have implemented it here in Rust: https://gist.github.com/orlp/121bb95361cc74b88aff79a7335a1a4.... This is for generating in [0, 1), but it is trivially changed to allow 1.0 by changing

    while exponent < -127 || exponent > -1 {
into

    while exponent < -127 {

It uses a single call to the RNG 255/256 = 99.61% of the time, if it has to use an extra RNG call the chance it then needs another after that is 2^-32, ad infinitum, which is negligible. More calls to the RNG could be needed if the exponent it generated was out of bounds which is also incredibly, incredibly unlikely (~2^-33 for [0, 1), ~2^-127 for [0, 1]). Thus the expected number of RNG calls is essentially just 255/256 + 2*1/256 ~= 1.00391.

A friend of mine also made a blog post that includes another implementation and discusses it: https://lokathor.github.io/prng/.

The difference between this technique and virtually all others out there on the internet is that this one is actually unbiased and can generate every single possible floating point value in [0, 1). Unbiased here means that the procedure simulates picking a real number uniformly at random from [0, 1] (with infinite precision), and then rounding it to the closest floating point value. We don't have to use infinite RNG bits to generate our infinite precision real number because we only have to generate enough bits such that it is unambiguous which floating point number we round to. The actual number of RNG bits needed is constant in expectation (see above), but unbounded, because our real number could be arbitrarily close to the boundary between two floating point values.

3 comments

The Python docs¹ include an implementation of Allen Downey's algorithm:

    from random import Random
    from math import ldexp

    class FullRandom(Random):

        def random(self):
            mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
            exponent = -53
            x = 0
            while not x:
                x = self.getrandbits(32)
                exponent += x.bit_length() - 32
            return ldexp(mantissa, exponent)
¹ https://docs.python.org/3/library/random.html#recipes
That's almost correct, but not quite (depending on your definition of correct). It fails to account for the problem that Allen Downey's paper so carefully took care of. I think it's best shown with an image (exaggerated to have a 2-bit mantissa and 2-bit exponent):

https://i.imgur.com/5TerZkm.png

Top is Python, bottom is correctly rounding to nearest float. Python's implementation does not round correctly to the nearest floating point value, it floors. This means it oversamples those floating point numbers with mantissa 0, and can not sample an inclusive range at all.

That can't be right:

  // Returns k with probability 2^(k-1).
Did you mean 2^(-k-1)?
Yes, woops, fixed.
FWIW that algorithm for uniform floating-point sampling in [0,1] is actually originally due to: Walker, Alastair J. “Fast generation of uniformly distributed pseudorandom numbers with floating-point representation.” Electronics Letters 10 (1974): 533-534.