Hacker News new | ask | show | jobs
by MrBuddyCasino 2970 days ago
And thats the part I still don't understand.
4 comments

Think of it this way: you are always in a chain. It can be between 1 and 100 boxes long. But it can’t be a dead end or not include the 1-note. Worse case it visits all boxes and gets back to 1.

Try forming a scenario that doesn’t. E.g a “loop”: Prisoner #1 opens box 1. Finds the number 2. Opens box 2 finds the number 3. Opens box 3... Could he now find the “2” that would lead to him being stuck in the 2-3 loop? No. He already found the 2. It can’t appear again. In the third box he’ll find a number he hasn’t seen. The loop cannot form in any other way than to come to where he started - which it will do when he finds the note with “1” on it.

So by starting with box 1, he is on a loop of length 1..100 including the pointer TO box 1 (which is what he is looking for).

Ok, so I‘m walking a linked list that will lead to my number. But I still don’t get why the odds are better than just opening the first 50 boxes and hope for the best? What changes because I‘m on the chain?
OK I'll give it another go :)

The chance of you finding your number with 50 random opens is 50%. But you don't live jusr because you find your number. You only live if everyone finds their number. And that chance is slim: it's .5^100 which is a very small number.

The chance of finding your number using the chain algo is the same as the chance that your number is found before the 50th step of your chain. That probability is still 50% because you are opening at most 50 boxes of a 100, the order doesn't matter. This is why you feel nothing will change, that the order shouldn't matter. That is correct: each prisoner opens 50 boxes without any prior information.

But the key is what changes for the probability of success for the group.

All prisoners use the same shuffling (it's only randomized once). Everyone will succeed when walking their chains, if the longest chain is shorter than 50. That is, if there are two chains of 60,40 then the group will fail. Because some of the prisoners who are unlucky and have numbers in the 60 loop will to reach their number in 50 steps. But if there are two chains of 50,50 or 3 chains 30,30,40 etc., then everyone will succeed.

So the difference is that with random choice, YOUR chance of finding YOUR number is 50%. The chance that all 100 does so you live, is .5^100.

With the chain walk, YOUR chance is still 50%, as is everyone elses. Yet the risk that someone fails is now just 70% instead 99.9999999...% (Now is no multiplication - the probabilities are not independent - the probability of success for the group is now a property of the loops rather than the product of 100 independent events)

I think this explanation makes the most sense, since it explains the non-independence of the player's actions. Another way I thought of it was this: imagine (despite the prisoners' knowledge) each prisoner actually had the same number. The "choose at random" strategy would still have a vanishingly small probably of success, while (any) deterministic strategy would yield a 50% rate of group success.
Imagine Strategy A, where everyone started at box 1 and checked 1-50. Half of the prisoners would succeed, guaranteed, but half would fail, guaranteed, and prisoners would never win.

A slightly better Strategy B is to start at their own box and all subsequnt ones, wrapping around at 100. Now you have at least a small chance, but each prisoner winning independently is low.

What cycle method Strategy C does is group up failures. When prisoners fail because there's a cycle of size 51+, 51 fail. When there's a cycle of 99 boxes, all prisoners fail. But in exchange, there's more possibilities where there aren't cycles that long.

So it's not that the odds are better for an individual as much as the odds are better that the individual's successes coincide. In Strategy A, the success of an individual prisoner decreases the estimated success probability for subsequent prisoners. In Strategy B, that decreased probability is nullified by incrementing the starting location by one step. In strategy C, a prisoner's success or failure amplifies the estimated success probability for others. Not by changing any state, but revealing the state itself.

The success criteria goes from the prisoners randomly picking correctly to the warden not randomly creating a loop larger than 50. If the warden creates no loop greater than 50 then the prisoners win. If he creates one of 51 or more then the prisoners lose because the 51st number wont be found.
That was the shortest description that worked for me. Framed this way, its obvious - thanks, also to all the others!
Think of it this way - if all the boxes are sorted then everyone finds their own number in their box.

But let's say that everything is sorted except your's and some other prisoner's number. Then it makes sense to follow the chain, because there is only one chain of two nodes.

Basically any number that is in its place is sorted and thus not part of a chain. Any number that is not in its place is not sorted and part of a chain.

There can be many different chains, but only one chain that leads to your number. By starting from your own box, you guarantee that you follow your own chain thus reducing the search space.

Maybe it helps to understand...

> If just one prisoner does not find his number, all prisoners die.

You're trying to find a strategy that is successful for all prisoners. Following the chain reduces the problem to "is there a cycle of length >50", which can be calculated easier.

Opening boxes randomly (or arbitrarily, as in "50 boxes starting with my number), by all prisoners, is 50%100 which as mentioned is vanishingly small. The trick is that chain-following actually gives information to each prisoner as they go, which helps leverage into a better win chance.

Opening 50 boxes at random gives you a 50% chance to hit the right box. But everybody else also has a 50% chance that’s independent from yours, so you just multiply the probabilities together to figure out the overall probability. This makes it vanishingly unlikely to get it right with a lot of people.

This strategy instead has you all collectively betting that the structure of the permutation is favourable, which doesn’t have that multiplicative problem.

Assuming that linked list is shorter than 50, you're guaranteed to find your own number.
This strategy weeds out chains that don't contain your number.

If there are 3 chains, one contains your number, and two are loops that don't contain your number, this strategy necessarily starts you in the chain that contains your number.

The goal isn't to maximize an individual's odds of finding their number, but maximizing the odds of the entire group finding their number.
So, following your cycle looks like this:

    Open box 7, find number 2
    Open box 2, find number 2700
    Open box 2700, find number -16
    Open box -16, find number 0.9
    Open box 0.9, find number ℵ
    Open box ℵ, find number 赢
    Open box 赢, find number two
    ...

As you can see, it's not necessary to use numeric digits to identify the boxes; any identifier at all will do.

But this is a cycle; if we had started by opening box 0.9, it would look like this:

    Open box 0.9, find number ℵ
    Open box ℵ, find number 赢
    Open box 赢, find number two
    ...
    Open box 7, find number 2
    Open box 2, find number 2700
    Open box 2700, find number -16
    Open box -16, find number 0.9
Exercise: given that this cycle starts with "open box 7", how does it end? (Or in other words, what is the second half of the line before "Open box 7"?) If this cycle instead started with "Open box ᚠ", how would it end in that case?
If you consider that each box only has one box that "points" to it, and that last box must necessarily be part of the box's cycle (by definition, there's no way to avoid it as if you hypothetically swapped the last box to point somewhere else then all you're doing is linking the cycles and you'll end up in the same place but later)
For me it helped to think of it as a collection of linked lists. At most, there can be 100 lists of length 1, where each number points to itself. On the other extreme, there are many ways to create a list of length 100.

The strategy takes advantage of the fact that randomly permuting the numbers is less likely to create one or two long lists, and more likely to create a bunch of shorter lists.

Since each prisoner knows which list he belongs to, he can use that information to reduce his search space from all of the boxes to just the boxes in his list.