Hacker News new | ask | show | jobs
by Someone 903 days ago
In C (and your Python conversion), the i < Random(1, 100) part is evaluated each time through the loop, not once at start of the loop to determine the limit.

So, it’s 1% that the loop ends at i = 1, if it takes that hurdle 2% that it ends at i = 2, if it takes that hurdle 3% that it ends at i = 3, etc.

The calculation is easier if you phrase that this way:

It’s 99% that the loop continues at i = 1, if it takes that hurdle 98% of the rest that it continues at i = 2, if it takes that hurdle 97% that it continues at i = 3, etc.

So, the probability to make it past i = n is

  0,99 × 0,98 × 0,97 × … × (1 - n/100)
Note that this isn’t necessarily the case in all languages. Pascal, for example, has a real for loop where the limit is evaluated once and the index variable cannot be changed inside the loop, so that the compiler can determine the number of iterations before starting the first iteration.
1 comments

I was aware that the Random function call gets evaluated in Python (and C) every time the loop iterates, I just couldn't imagine the probability distribution myself, I had assumed that all numbers are uniformally distributed but didn't cater for the sums. You're a legend! Now I see it. I wrote the following code to check out your argument and it checks out indeed :thumbs-up:

  def p(n):
    total = 1
    for i in range(1, n):
      total *= (100-i)/100
    return total


  for i in range(1, 100):
    print(f"p({i}) = {p(i)}")

The first 13 results:

  p(1) = 1
  p(2) = 0.99
  p(3) = 0.9702
  p(4) = 0.9410939999999999
  p(5) = 0.9034502399999998
  p(6) = 0.8582777279999998
  p(7) = 0.8067810643199997
  p(8) = 0.7503063898175998
  p(9) = 0.6902818786321918
  p(10) = 0.6281565095552946
  p(11) = 0.5653408585997651
  p(12) = 0.503153364153791
  p(13) = 0.44277496045533604
p(12) sits at 50%, of course it will be the mean! :D
> p(12) sits at 50%, of course it will be the mean!

Technically, “the mean is 12” does not follow at all from “p(12) sits at 50%”.

“p(12) sits at 50%” implies the median (https://en.wikipedia.org/wiki/Median) is 12, and that can differ from the mean (https://en.wikipedia.org/wiki/Arithmetic_mean) of the distribution, and the difference can be quite large.

For example, if that list were to continue

  p(13) = 0.44277496045533604
  p(14) = 0.44277496045533604
  p(15) = 0.44277496045533604
  p(16) = 0.44277496045533604
  p(17) = 0.44277496045533604
  p(18) = 0.44277496045533604
  …
  p(1000) = 0.44277496045533604
  P(1001]) = 0
the about 44% that makes it to 13 also will make it to 1000, and the mean value will be a bit more than 442.77496045533604 (the contribution of that 44.2%)

If, on the other hand, it were to continue

  p(13) = 0.44277496045533604
  p(14) = 0.0
No result would be higher than 13, and the mean would be lower than 12.