Hacker News
by guytv 213 days ago
If you take the UTF-8 binary for “hello world” and paste it there, it passes 4 out of 5 randomness tests.

Strange.

(0110100001100101011011000110110001101111001000000111011101101111011100100110110001100100)
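For anyone who wants to reproduce the bit string, here is a minimal Python sketch. It also runs one simple check, the frequency (monobit) test as described in NIST SP 800-22. That test is a stand-in chosen for illustration: which five tests the site runs is not stated here, and the 0.01 pass threshold is the conventional NIST significance level, assumed rather than taken from the tool. The helper names `to_bits` and `monobit_p_value` are made up for this example.

```python
import math


def to_bits(text: str) -> str:
    """Return the UTF-8 encoding of text as a string of '0' and '1'."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def monobit_p_value(bits: str) -> float:
    """Frequency (monobit) test: map 1 -> +1 and 0 -> -1, sum, normalise.

    A p-value near 0 means the counts of ones and zeros are too
    unbalanced to look like fair coin flips.
    """
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)
    return math.erfc(abs(s) / math.sqrt(2 * n))


bits = to_bits("hello world")
p = monobit_p_value(bits)

ALPHA = 0.01  # assumed significance level, not taken from the tool
print(bits)
print(f"length={len(bits)} ones={bits.count('1')} p={p:.3f}")
print("pass" if p >= ALPHA else "fail")
```

With only 88 bits and a nearly even split of ones and zeros, a test like this has very little evidence to work with, so a pass is not surprising.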

2 comments

It is very easy for short strings to pass most of the tests.
Yes, I tried it with PHP: it failed the Block Frequency Test at a size of 8800, but it was fine at 880. Then I tried another random sequence of 8800, and it also failed the Autocorrelation Test.
To me, it looks like there is some repetition in the binary representation. English-language phrases in UTF-8 are not going to look random.
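One way to see the repetition is to count how often each of the eight bit positions is set across the bytes of the phrase. The sketch below is only an illustration, and `ones_per_bit_position` is a hypothetical helper name. It relies on one known fact: ASCII characters encode to single UTF-8 bytes whose top bit is always 0.

```python
def ones_per_bit_position(text: str) -> list[int]:
    """Count set bits at each position (index 0 = most significant bit)."""
    counts = [0] * 8
    for byte in text.encode("utf-8"):
        for pos in range(8):
            counts[pos] += (byte >> (7 - pos)) & 1
    return counts


text = "hello world"
counts = ones_per_bit_position(text)
total_bytes = len(text.encode("utf-8"))

for pos, ones in enumerate(counts):
    print(f"bit {pos}: {ones}/{total_bytes} bytes have a 1")
```

In a random bit stream each position would be set in roughly half the bytes. Here the top bit is never set and the third bit is set in every byte, which is the kind of structure a test needs a longer input to detect.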