Hacker News new | ask | show | jobs
by acqq 2933 days ago
The way the mentioned PRNGs "fail" is when testing just the lower bits (search for the occurrences of "lsb" in [1] above) and this may be important in your use cases or not. The same [1] claims in "Visual Summary" that the "cycles/byte" is 1 for various PRNGs but http://xoshiro.di.unimi.it/ seems to show that the reason splitmix64 is not preferred everywhere is that xoroshiro128+ is roughly two times faster than splitmix64 .

Regarding having lower bits poor statistically, it was known since forever that that is the case for the huge class of simple PRNGs (effectively all that are faster than the alternatives, unless maybe if there's some specialized instruction in the CPU), the question is if that is critical or not for your purposes. The author of xoroshiro128+ is of course aware of that issue, and he also writes:

"For general usage, one has to consider that its lowest bits have low linear complexity and will fail linearity tests; however, low linear complexity can have hardly any impact in practice, and certainly has no impact at all if you generate floating-point numbers using the upper bits (we computed a precise estimate of the linear complexity of the lowest bits)."

In short, if you don't know how you're going to use the PRNG and you don't have problems related to speed, sure, use the safest one. Note that "safety" is still differently understood in different use cases, e.g. take care to note that most of fast PRNGs still aren't cryptographically secure:

https://en.wikipedia.org/wiki/Cryptographically_secure_pseud...

and that sometimes even "standardized" "cryptographically secure" turn out to be something else, e.g. the subtitle on that wikipedia page:

"NSA kleptographic backdoor in the Dual_EC_DRBG PRNG"

Also other considerations come into play when you have some specific needs and you understand the consequences: then it's not only black-and-white "safe" v.s. "not safe." For some purposes (as the mentioned generation of the floating-point numbers in some use cases) speed matters enough to sacrifice some "perfectness."

2 comments

> The same [1] claims in "Visual Summary" that the "cycles/byte" is 1 for various PRNGs but http://xoshiro.di.unimi.it/ seems to show that the reason splitmix64 is not preferred everywhere is that xoroshiro128+ is roughly two times faster than splitmix64 .

I have tested xoroshiro128+ vs splitmix64 in several procedural generation & simulation code bases in C and Swift. I could never confirm the numbers on http://xoshiro.di.unimi.it/. In fact, splitmix64 was slightly faster in all my tests with different optimizations enabled. I always assumed that's because its state only occupies a single register which certainly matters in practical applications (especially in C with its restricted calling conventions). I am not absolutely sure whether that was always the reason, though.

I use fast RNGs for kernel projections, FHT-accelerated JL transforms, and data generation for numerical experiments. I don’t need cryptographic security for these purposes.
And if you generate floating point numbers maybe you don’t have to worry about lsbs alone either? The “failed” tests drop away the bits that matter the most when floating point randomness is constructed.
I don’t know the details on which bits matter in the ziggurat algorithm, which is the one I use. Is this the case for all floating point random number generators?
The PRNGs you mentioned before all generate integers. The conversion from the integer PRNG to the floating point, and the conversion from one integer range to another range needed in the ziggurat algorithm both need to be done "right" to give correct results (I can imagine that even using splitmix64 the wrong implementation could be programmed by somebody not knowing what has to be done), so if you aren't sure about these steps you should surely check their quality yourself. If these are done right, I'd personally expect xoroshiro128+ could be "good enough" even when having that specific weakness (poorer quality when using only lsb bits) that you worried about. It's important, of course, not to drop the highest bits away.

And on another side, 2om3r questions the speed measurements of the author, and I think he has a point: the speed measurements should be made in the context of the real use, otherwise the compilers are able to "cheat" (optimize pieces of the code away) if the example is too simple.