Hacker News
by crazygringo 2791 days ago
Thanks for the interest -- it's actually just the sum of the probabilities for the items ranked 1 to 10,000, i.e. the expected number of items you know. For example, if there's a 0.1 chance you know each of 10 items, that adds up to a total value of 1 -- no cutoff needed.
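
To make the arithmetic concrete, here is a minimal sketch of the 10-item example in Python. The variable names are just for illustration; the only thing taken from the comment is the 0.1 chance per item.

```python
import math

# Illustrative input from the example above: 10 items,
# each with a 0.1 chance of being known.
probabilities = [0.1] * 10

# The total is the plain sum of the per-item probabilities,
# with no cutoff applied. fsum avoids floating-point drift.
expected_known = math.fsum(probabilities)

print(expected_known)
```

Every item contributes its own probability to the total, however small, which is why no threshold is needed.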

Mathematically, there's a trick where you don't even need to compute the sum item-by-item... I fit a binomial regression, which gives me the two relevant parameters; from those I can calculate the probability density function (PDF) [1] for an item of a given rank. Then I just evaluate the associated cumulative distribution function (CDF) [2], with the same two parameters, at rank 10,000 -- and that's the final result.
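
As a rough illustration of why a closed form can replace the item-by-item sum, here is a sketch under loud assumptions: it assumes the fitted curve is a plain logistic function of rank with two parameters `a` and `b`, and it uses the integral of that curve as the cumulative quantity. The comment doesn't say which link function or parameterization was actually used, so the curve, the parameter values, and every function name below are hypothetical stand-ins, not the method described.

```python
import math


def known_probability(rank, a, b):
    """Assumed logistic curve: chance of knowing the item at this rank."""
    return 1.0 / (1.0 + math.exp(-(a + b * rank)))


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def expected_known_closed_form(max_rank, a, b):
    """Integral of the assumed curve over [0.5, max_rank + 0.5] (b != 0).

    The antiderivative of the logistic function of a + b*x is
    softplus(a + b*x) / b; the half-rank offsets make the integral
    a midpoint approximation of the sum over ranks 1..max_rank.
    """
    upper = softplus(a + b * (max_rank + 0.5))
    lower = softplus(a + b * 0.5)
    return (upper - lower) / b


# Made-up parameters, chosen only so the probability falls with rank.
a, b = 2.0, -0.001
max_rank = 10_000

# Item-by-item sum versus the two-parameter closed form.
item_by_item = math.fsum(
    known_probability(rank, a, b) for rank in range(1, max_rank + 1)
)
closed_form = expected_known_closed_form(max_rank, a, b)

print(item_by_item, closed_form)
```

With a smooth curve like this one, the two numbers agree very closely, so the cumulative function evaluated at the top rank can stand in for the full sum.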

[1] https://en.wikipedia.org/wiki/Probability_density_function

[2] https://en.wikipedia.org/wiki/Cumulative_distribution_functi...