Hacker News new | ask | show | jobs
by willis77 2095 days ago
The digits of pi contain every pdf that ever could and ever will exist.
8 comments

"Find the earliest valid pdf in consecutive digits of pi"
I mean, the answer is trivially zero, there exists a PDF-like structure somewhere in Pi, and the offset of that doesn't have to be zero, it can start or end anywhere. So the range [0, N] is a valid PDF.
"Find the last byte of the first valid PDF in the binary digits of Pi"
Since the PDF also doesn't have to be end-aligned, the answer is trivially [0, infinity].

The first place a valid PDF could be ended, perhaps.

A pdf at [0, N] sorts before the one at [0, N+1], by "first valid pdf".
No, both start at 0. Also, [0, infinity] and [0, infinity+1] are the same thing.
Please don't give them any ideas.. the whiteboard interview coding tests are hard enough as it is
How else will we weed out the fakers and people coasting for 10 years? Our CRUD SaaS app needs top people.
Doesn't sound like that hard of a question, given you are provided the structure of the PDF header. I guess it really comes down to substring search.
Imagine if it was a PDF that simply rendered the number 42.
If that happens we know for a fact that we are in a simulation
Well maybe. We don't know if pi is a normal number.
Actually it only needs to be a disjunctive (or rich) number which is a weaker condition.

We don't know whether pi is that either for any integer base.

> We don't know if pi is a normal number.

Sure we do. There are plenty of proofs out there that pi is an irrational number.

Irrational does not imply normal. For example, 1.01001000100001... is irrational but it's certainly not normal.
Technically, 1.01001000100001... can be normal depending on what ... stands for. :)
Well, obviously. But presumably the ... is meant to imply that this is the summation of 1/(10^(x(x+3)/2)).
Or what 1 or 0 or . stands for.
Actually I'd argue the example you provided is normal, as long as you authorise a particular encoding where every number n you're looking for is encoded as a string of n zeros.

It's then trivial to see that every number you can think of is encoded in there, and therefore any data, piece of music or movie that ever existed.

(I'm not sure we're allowed to fiddle with the encoding, but since we allow ourselves to represent a piece of music into a number, we're already talking about encoding anyway, so it doesn't seem like cheating to me...)

Normality of a number is with respect to number bases, so your trick with encoding is invalid. Otherwise, every computable number could be considered normal - take an algorithm for generating of it, supply a random string (this is the encoding), disregard the random string, and you have a perfectly valid normal representation of your number. So it is cheating.
I agree that normality is a specific formalized concept, but you could always require that an encoding function like this is injective.
Encoding doesn't count. Normality is a very specific mathematical concept: https://en.wikipedia.org/wiki/Normal_number

Also, 1.01001000100001... is a good example of a number that is both irrational and transcendental but not normal.

Normal in this sense means that all the frequency of all digits approaches a uniform distribution as the length of the sample increases towards infinity. Basically if we could see "all of" π and count all the 0s, 1s 2s, 3s, &c to 9 all the counts would be equal.
That on its own can't be right, because 0.12345678901234.....

According to wikipédia, you gave a definition for "simply normal", and for normal numbers the distribution of any sequence of digits is uniform. So 00, 01, ..., 99 each occur uniformally too.

Moreover you need to consider it with regards to all other bases than 10 too.
Is this correct, mathematically?

I understand the point that PI contains every possible piece of information, theoretically.

However, the chance of finding a given string in PI depends on the string’s length. The longer the string, the more the probability tends to 0.

The paradox therefore is that PI contains every PDF, but you will never find them, so in what sense does it really contain them at all?

No, all strings theoretically exist in 𝛑 given enough digits, so longer strings don't reduce probability of existence, they just mean that it will take more digits to find them.
See Borell-Cantelli lemma.
I looked this up but I’m not sure I grasp your point.

Are you saying that:

- given a long string, we might ask “can this string be found in PI?”

- the probability of finding a long string in PI is infinitely small

- the number of possible strings in PI is infinitely large

- it’s not possible to decide if the answer is yes or no?

If a tree falls in the forest and no one is around to hear it fall. Or a modern take, if a disease has no symptoms is it really a disease.
<citation needed>

Including a PDF that generates the digits of pi

actually, if you find the citation, let me know, you might be in for an award
I'm not sure that's necessarily true. It is true (at least with a non-constructive proof) that if you pick a 'random' real number then it contains all possible PDFs with probability one ( or that the set of numbers for which this is not true has lebesgue measure zero). But I'm not sure it's known that pi has this property.
Pi is thought to be normal but it hasn't been proven yet, so we can't say that for sure, but it's likely true.
I don't think that is a proven fact.
Since a PDF can begin with non-PDF content, then pi itself is a valid PDF file.