Hacker News new | ask | show | jobs
by jlebar 1534 days ago
I remember being disappointed by this book.

I wanted to read about algorithms for choosing a random float in [0,1], and IIRC its only suggestion was randint()/INT_MAX. This is wrong in part because the smallest nonzero value you'll ever choose is 1/INT_MAX, which is much larger than the smallest nonzero float.

Is there a better book out there? I was hoping for something fabulous like Hacker's Delight but for floating point.

7 comments

I don't know about a better book, but I do know how to generate a random float in [0, 1] very efficiently and accurately. A 2007 paper describing the technique is Generating Pseudo-random Floating-Point Value by Allen B. Downey. I have implemented it here in Rust: https://gist.github.com/orlp/121bb95361cc74b88aff79a7335a1a4.... This is for generating in [0, 1), but it is trivially changed to allow 1.0 by changing

    while exponent < -127 || exponent > -1 {
into

    while exponent < -127 {

It uses a single call to the RNG 255/256 = 99.61% of the time, if it has to use an extra RNG call the chance it then needs another after that is 2^-32, ad infinitum, which is negligible. More calls to the RNG could be needed if the exponent it generated was out of bounds which is also incredibly, incredibly unlikely (~2^-33 for [0, 1), ~2^-127 for [0, 1]). Thus the expected number of RNG calls is essentially just 255/256 + 2*1/256 ~= 1.00391.

A friend of mine also made a blog post that includes another implementation and discusses it: https://lokathor.github.io/prng/.

The difference between this technique and virtually all others out there on the internet is that this one is actually unbiased and can generate every single possible floating point value in [0, 1). Unbiased here means that the procedure simulates picking a real number uniformly at random from [0, 1] (with infinite precision), and then rounding it to the closest floating point value. We don't have to use infinite RNG bits to generate our infinite precision real number because we only have to generate enough bits such that it is unambiguous which floating point number we round to. The actual number of RNG bits needed is constant in expectation (see above), but unbounded, because our real number could be arbitrarily close to the boundary between two floating point values.

The Python docs¹ include an implementation of Allen Downey's algorithm:

    from random import Random
    from math import ldexp

    class FullRandom(Random):

        def random(self):
            mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
            exponent = -53
            x = 0
            while not x:
                x = self.getrandbits(32)
                exponent += x.bit_length() - 32
            return ldexp(mantissa, exponent)
¹ https://docs.python.org/3/library/random.html#recipes
That's almost correct, but not quite (depending on your definition of correct). It fails to account for the problem that Allen Downey's paper so carefully took care of. I think it's best shown with an image (exaggerated to have a 2-bit mantissa and 2-bit exponent):

https://i.imgur.com/5TerZkm.png

Top is Python, bottom is correctly rounding to nearest float. Python's implementation does not round correctly to the nearest floating point value, it floors. This means it oversamples those floating point numbers with mantissa 0, and can not sample an inclusive range at all.

That can't be right:

  // Returns k with probability 2^(k-1).
Did you mean 2^(-k-1)?
Yes, woops, fixed.
FWIW that algorithm for uniform floating-point sampling in [0,1] is actually originally due to: Walker, Alastair J. “Fast generation of uniformly distributed pseudorandom numbers with floating-point representation.” Electronics Letters 10 (1974): 533-534.
If you want to generate a uniform random float in [0,1), half your results have to be in the [0.5,1) range. That space is half as dense as the [0, 0.5) range. (And this applies again for [0.25,0.5) vs. [0, 0.25), etc, and so on. And this is also disregarding the whole space of denormalized floats.) So regardless of your method, you're going to have to ignore nearly all the floats in the range! (Or generate many more than 32 bits and use them non-uniformly to make a uniform output - see orlp's comment.)

There are other problems with what you say their suggestion was, e.g. INT_MAX should be RAND_MAX, and you need to take some care with rounding/conversion, and inclusive vs. exclusive upper bound requires careful attention. But with any simple approach you just can't use most of the small float space, and so the division approach with the details done correctly is pretty common!

Edit since rate-limited: orlp, I think we're only in disagreement about what qualifies as "very efficient". It's an efficient implementation of the more complex problem, but it's also much slower than the division. I wouldn't want it to be the default float generator for a general-pupose language - even most scientific modeling or statistical analysis code won't benefit from the increased output space. OP should think very hard about how they intend to use values in that range before insisting they really need it - the division approach really is sufficient and NR is right to recommend it by default!

You can make this approach very efficient using bitwise operations, and the count trailing zeros instruction common on many CPUs. See my other comment for a reference to the paper describing the technique and a link to a low level implementation by me.
Because of how floating-point numbers are distributed, you should expect a random number drawn uniformly from [0,1] to basically never hit the smallest floating-point number: hitting something in the range [0, 2^-n] should happen with probability 2^-n.

I agree there is much more one could want out of a uniform random float generator than just k/N where k is a integer drawn uniformly from {0, 1, …, N}, but hitting arbitrarily small floating point numbers is not on this list.

If you are intending to sample numbers in the range [0,1] such that each float occurs with equal probability then that’s fine, but it’s certainly not uniformly random on [0,1].

Steelmanning their argument, an ideal output would consist of every possible float in [0,1] occurring with an appropriate probability -- some of those vanishingly unlikely. Kind of like how randint() should be capable of returning every possible integer rather than, say, every multiple of 64.

It's fine to counter that most of the time the difference wouldn't matter or that it might be problematic to compute for some reason or another, but I don't think it's reasonable to critique the idea on the grounds of the result not being uniform when they didn't actually ask for each float to have equal probability.

Floating point numbers are represented in a form mantissa*2^exponent and they are distributed non-uniformly over their dynamic range. The density of normalized 64-bit floats around zero is about 10^-320, while around one is about 10^-16. Therefore, do not expect to get the smallest positive float in sampling from a uniform distribution over [0,1]. There is a good tutorial about floating point numbers and their pitfalls written for computer scientists [0] where Figure 1 on page 7 (pdf page count 3) shows how non-uniformly floating point numbers are distributed over the number line.

[0] https://pages.cs.wisc.edu/~david/courses/cs552/S12/handouts/...

I think the OP knows that, but complains about the fact that randint()/INT_MAX (which my mind autocorrects to rand()/RAND_MAX) takes on at most RAND_MAX+1 different values (that _may_ be as low as 32,768), and returns each one with equal probability, so that it returns zero 1 in (RAND_MAX+1) times, on average, and won’t ever return most possible floats in [0,1]

Even if you decide that returning only RAND_MAX+1 different values, I think you should divide the [0,1] interval into RAND_MAX+1 equal sized intervals and return the floating point value at the center of one of such intervals. The given solution has both zero and one stand for a value in an interval of size 1/(2RAND_MAX), but picks them with probability 1/RAND_MAX, so it picked both of them way too often.

Aside from that there’s:

- rand(), on many systems, is a very poor random number generator.

- the API of rand()* is poor. Because there’s only one global state, you can’t run multiple independent generators, and it doesn’t help you much as soon as you want to generate anything other than a discrete distribution with RAND_MAX+1 items (a number that even is implementation dependent, so that portable code needs special care even if it may not be needed on most platforms). For example, because RAND_MAX may be even, you can’t even simulate a fair coin toss with a single call of rand().

If you want a random number in [0,1] with a uniform distribution from N random bits, I don't see how you can expect an epsilon of less than 1/(2**N-1)
Sure, but who said N has to be sizeof(int)? And besides that, the distance between x and successor(x) is not constant throughout the range of [0, 1).
N is sizeof(float), which is conveniently the same as sizeof(int).

> And besides that, the distance between x and successor(x) is

That is approximately balanced by using random() int / MAXINT

Choosing a random float is not the same as choosing a random number.
What applications call for a "random float in [0,1]" with distribution defined in the space of fp representation and not the usual kind?
For example you when you want to do fuzzing of some algorithm that uses floats, a random float might come in handy.
In [0,1]?

(I consider this answer nearly disingenuous. It's obviously not what OP was talking about.)

Choosing a random float doesn't mean very much without a distribution.
Randomness is not a property of numbers but of distributions.
Actually I believe it is better to say randomness is a property of a sequence of numbers. So a "random sequence" has members chosen independently of each other, but all from the same probability distribution. And random number generators create random sequences which follow a given distribution.
You can apply the core ideas of Hacker's Delight to floats too. Your example problem, taken from [0]:

  float frand( int *seed )
  {
      union
      {
          float fres;
          unsigned int ires;
      };
  
      seed[0] *= 16807;
      ires = ((((unsigned int)seed[0])>>9 ) | 0x3f800000);
      return fres - 1.0f;
  }
You basically assemble a float building a random mantissa and assigning the right exponent and sign bits.

It's not portable, but it's a very neat trick!

[0] https://iquilezles.org/www/articles/sfrand/sfrand.htm*

It's a neat trick, but can not generate all floating point values. See my other comment for that.
The source of disagreement is: his code generates floating point approximations of uniformly-distributed real values in [0,1]; not random floats greater than 0 and smaller than 1.

This distinction has little to do with the subtlety of floats. You can e.g. generate numbers up to 100 with

- 5d20 (5 dice of 20 sides),

- 20d5 or even

- 16d6+1d4.

[Edit: the point of this exercise: these are different representations of the numbers up to 100.]

But even assuming each die with D sides has equiprobable results with prob 1/D, these choices don't have the same probability distribution. To convince yourself of this, compare 1d12 (the probability of getting 1 is 1/12) with 2d6 (the probability of getting 1 is 0).

If you want max entropy and normally distributed values then, I think, there's no other way to go. Nice post btw
What do you mean by "normally distributed values"? Your reply also doesn't follow a normal distribution by the way. Did you mean uniform? For uniform values there is a way to go and my other reply exactly describes how.
Of course, it's uniform. I always swap the two names randomly hehe. I understand your code. The difference with the snippet I posted (which, btw, is not mine) is that the difference between successive floats is always a constant, i.e. there is no rounding. It's fast, has 23 bits of entropy and it's suited for constrained platforms/small demos [0], since it can be coded with very few asm instructions.

I'm not saying that your solution hasn't its merits, well done.

[0] The author of the code is a known member of the demoscene

Oh, I know very well who iq is :) And for a visual demo it might make perfect sense to sacrifice completeness of a solution for speed / code size. But to me, "pick a random number from {k/2^23 : 0 <= k < 2^23, k ∈ ℕ}" and "pick a random number uniformly at random from [0, 1)" are not the same.

All operations (add/sub, mul/div, sin, exp, log, etc...), with IEEE754 floats are defined using real arithmetic, and then assumed to round to the nearest representable floating point value. I don't see why random number generation, at least as a default before micro-optimizing, should be any different.