Hacker News new | ask | show | jobs
by gizmo686 2551 days ago
There is a simpler solution if you do not mind leaving entropy on the table. Break the people up into groups of two without regard for their selection. If two people provide the same number, ignore them. Otherwise, output "0" if the first person's number is smaller then the second, and output "1" if the first person's number is larger.

From this, you have a sequence of uniformly random bits, from which you can construct a uniformly random number between 1 and 10. A simple way (although, again, likely leaving entropy on the table) is to break these bits into groups of 4; interperate them as 4 bit unsigned integers, and discard any result that is not in the range 1-10.

4 comments

That works and is perfectly uniform (if the people are i.i.d.), but requires quizzing around 20, 40, 60 or more people for a number before you can deliver one, while the algorithm described only requires one or two.

EDIT: Not quite that many, more like around 9, 18, 27 or so, see below.

I'm not sure I understand.

Humans aren't like weighted dice. For example, it's certainly conceivable that a few pairs of these humans misunderstand the goal such that they have a 0% chance of ever choosing numbers that change the respective ordering. Add to that the higher probability that many of these pairs of humans will just toggle their orders each round.

Edit: clarification-- the humans don't even have to misunderstand the rule, just the upshot of the process.

Maybe you don't even need to ask them to think of a number. The person whose name is sorted before the other person's name is person 1. And they compare their lengths instead of thinking of a number?
That gives you a single bit of entropy from those two people instead of a steady stream, though.
That's a good point, I guess you have to also randomly choose the humans?
I don't think so, if they were all lined up in alphabetical order and you took them in pairs from that ordering, it would be just as good.
But then you'll always get the same output from the same set of humans. It's a PRNG where the set of humans is the seed.
Nonsense. It requires roughly 1+log(10) bits in expectation and has a geometric tail.
Eh? 4 bits minimum to get a uniform 1-16, and each bit requires requires 2 numbers that are not identical. (Ah yeah, I thought that in turn requires a bit more than 4 numbers, but it requires only a bit more than 2 numbers (probability that numbers coincide is not 1/2, as with bits, but 1/10, or a bit more than that due to non-uniformity)).

Thus, I recant "around 20, 40, 60 or more" and replace it by "around 9, 18, 27 or more" numbers. Probability that you have to reject any resulting hex digit is obviously 3/8.

Rejection sampling isn't the best way to go.

With b bits you are subdividing the (0,1) interval into 2^b regions, and only need more bits if one of the 9 multiples of 0.1 land in the interval you've picked. As b increases this probability drops exponentially.

It's a fair point that the number of "numbers" depends on how low the entropy of the source is, but in the link the probability of collision wasn't massive.

But the original post described rejection sampling (once you got your bits from the larger/smaller trick).
Absolutely! I read the von Neumann extractor and presumed too much. For fun, you can reduce the numbers asked by getting a batch of k people, and if all numbers are dissimilar you get a draw from k!

The OP is silly though; if you know the distribution there are way better techniques. They don't seem to worry about how they measure it.

Parents algorithm always works, the one described has a massive failure mode - people slowly learn that they are likely to say 7 and self-censor by picking another number. Or maybe in a different culture the distribution changes (say China and the numbers 4, 8).
> the one described has a massive failure mode

Well, if you're really interested in generating random numbers using a group of people, then yes, this may be a problem.

But you should take this as an illustration of the more general problem of generating uniform random numbers from a distribution that is not.

If you think about it this way, author's solution can be applied in a number of practical scenarios. For example, if you want to generate a hash values for objects using properties that are not uniformly distributed. Or if you want to generate random numbers from a physical artifact that is not perfectly uniform.

I would guess that in China the probabilities of 4 and 8 goes down. Asking people to pick randomly drives them away from special numbers.
In the US 7 is a lucky # and a statistical outlier in that it was more frequently chosen in the article above.
You might be right, but my guess is that 7's reputation for luck isn't famous enough to make a difference, and in fact it's common because other numbers seem too "unrandom".

We could test by asking people to pick numbers less than 100. I bet people would focus on odd numbers, especially those greater than 50, not ending in 5 or not having both digits the same.

I decided to do that.

Data and light analysis is here: https://docs.google.com/spreadsheets/d/1Dh0wiTCRkBhckWGXtjZg...

I definitely overpaid on Mechanical Turk per response, given how lightning quickly the data came in. (I decided to pay $0.10/HIT for 50 responses and got 68 responses in 8m26s.) I suspect that offering a nickel would have gotten the survey filled in under half an hour still...

Excellent point. The algorithm in the article is predicated on a specific distribution and iid, GP algorithm is predicated only on iid.
That's pretty good, as long as each person can't hear any of the previous numbers! The article has the same issue, and might even be biased by the second person knowing they're the second person.
Maybe have them write their numbers down and put them in a hat? Or hey, this is 2019, have them connect to a website or bluetooth beacon and enter a number on their phone?

Ooh, or I bet you could make a 10-digit keypad that wirelessly reports to a nearby Raspberry Pi for $2-5 in parts, sort of like those 'clickers' that universities use to quiz large lecture classes.

It's too bad that most cheap radio modules are so short-range, because it would be interesting to make something like that and nail them to telephone poles with notes asking people to press a button. You'd have to control for people doing things like hammering the same button repeatedly for a laugh, but it would be fun to see what happened.

Would you be allowed to use HAM bands in the US for that sort of thing if you made them like beacons which sent a callsign after the number value and device ID?

If you are picking numbers from a hat, why not just put in ten numbers and pick one?
Here, we see the difference between scientists and engineers ;)
Well, each group of two could use their own hat. Or they could just write the numbers down instead of saying it. I dunno, the point is they don't have to say them out loud.
Just use the unlicensed ISM bands.
Exactly. Not only aren't people random, they're predictable based on history[0]. Each person could only be asked for one number and they shouldn't even wait perceptibly different lengths of time to be asked.

[0] http://people.ischool.berkeley.edu/~nick/aaronson-oracle/ind...

I've tried this a few times (refreshing the page in between). After a hundred or so strokes the computer is consistently less than 50%, usually ~45%. Suggesting my brain has somehow spontaneously worked out whatever algorithm it's using to guess. I don't think I'm random but there's something too simplistic about the method on this page.
Works except when both people in every group give the same number
does this algorithm have a name?
thanks!