Hacker News new | ask | show | jobs
by rbmktechik 2551 days ago
Parents algorithm always works, the one described has a massive failure mode - people slowly learn that they are likely to say 7 and self-censor by picking another number. Or maybe in a different culture the distribution changes (say China and the numbers 4, 8).
3 comments

> the one described has a massive failure mode

Well, if you're really interested in generating random numbers using a group of people, then yes, this may be a problem.

But you should take this as an illustration of the more general problem of generating uniform random numbers from a distribution that is not.

If you think about it this way, author's solution can be applied in a number of practical scenarios. For example, if you want to generate a hash values for objects using properties that are not uniformly distributed. Or if you want to generate random numbers from a physical artifact that is not perfectly uniform.

I would guess that in China the probabilities of 4 and 8 goes down. Asking people to pick randomly drives them away from special numbers.
In the US 7 is a lucky # and a statistical outlier in that it was more frequently chosen in the article above.
You might be right, but my guess is that 7's reputation for luck isn't famous enough to make a difference, and in fact it's common because other numbers seem too "unrandom".

We could test by asking people to pick numbers less than 100. I bet people would focus on odd numbers, especially those greater than 50, not ending in 5 or not having both digits the same.

I decided to do that.

Data and light analysis is here: https://docs.google.com/spreadsheets/d/1Dh0wiTCRkBhckWGXtjZg...

I definitely overpaid on Mechanical Turk per response, given how lightning quickly the data came in. (I decided to pay $0.10/HIT for 50 responses and got 68 responses in 8m26s.) I suspect that offering a nickel would have gotten the survey filled in under half an hour still...

Maybe 7 cents?
Excellent point. The algorithm in the article is predicated on a specific distribution and iid, GP algorithm is predicated only on iid.