| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by gizmo686 2551 days ago
	There is a simpler solution if you do not mind leaving entropy on the table. Break the people up into groups of two without regard for their selection. If two people provide the same number, ignore them. Otherwise, output "0" if the first person's number is smaller then the second, and output "1" if the first person's number is larger. From this, you have a sequence of uniformly random bits, from which you can construct a uniformly random number between 1 and 10. A simple way (although, again, likely leaving entropy on the table) is to break these bits into groups of 4; interperate them as 4 bit unsigned integers, and discard any result that is not in the range 1-10.

4 comments

FabHK 2551 days ago

That works and is perfectly uniform (if the people are i.i.d.), but requires quizzing around 20, 40, 60 or more people for a number before you can deliver one, while the algorithm described only requires one or two.

EDIT: Not quite that many, more like around 9, 18, 27 or so, see below.

link

jancsika 2551 days ago

I'm not sure I understand.

Humans aren't like weighted dice. For example, it's certainly conceivable that a few pairs of these humans misunderstand the goal such that they have a 0% chance of ever choosing numbers that change the respective ordering. Add to that the higher probability that many of these pairs of humans will just toggle their orders each round.

Edit: clarification-- the humans don't even have to misunderstand the rule, just the upshot of the process.

link

mongol 2551 days ago

Maybe you don't even need to ask them to think of a number. The person whose name is sorted before the other person's name is person 1. And they compare their lengths instead of thinking of a number?

link

pdpi 2551 days ago

That gives you a single bit of entropy from those two people instead of a steady stream, though.

link

hcs 2551 days ago

That's a good point, I guess you have to also randomly choose the humans?

link

dmurray 2551 days ago

I don't think so, if they were all lined up in alphabetical order and you took them in pairs from that ordering, it would be just as good.

link

pteraspidomorph 2551 days ago

But then you'll always get the same output from the same set of humans. It's a PRNG where the set of humans is the seed.

link

frankmcsherry 2551 days ago

Nonsense. It requires roughly 1+log(10) bits in expectation and has a geometric tail.

link

FabHK 2551 days ago

Eh? 4 bits minimum to get a uniform 1-16, and each bit requires requires 2 numbers that are not identical. (Ah yeah, I thought that in turn requires a bit more than 4 numbers, but it requires only a bit more than 2 numbers (probability that numbers coincide is not 1/2, as with bits, but 1/10, or a bit more than that due to non-uniformity)).

Thus, I recant "around 20, 40, 60 or more" and replace it by "around 9, 18, 27 or more" numbers. Probability that you have to reject any resulting hex digit is obviously 3/8.

link

frankmcsherry 2551 days ago

Rejection sampling isn't the best way to go.

With b bits you are subdividing the (0,1) interval into 2^b regions, and only need more bits if one of the 9 multiples of 0.1 land in the interval you've picked. As b increases this probability drops exponentially.

It's a fair point that the number of "numbers" depends on how low the entropy of the source is, but in the link the probability of collision wasn't massive.

link

FabHK 2551 days ago

But the original post described rejection sampling (once you got your bits from the larger/smaller trick).

link

frankmcsherry 2551 days ago

Absolutely! I read the von Neumann extractor and presumed too much. For fun, you can reduce the numbers asked by getting a batch of k people, and if all numbers are dissimilar you get a draw from k!

The OP is silly though; if you know the distribution there are way better techniques. They don't seem to worry about how they measure it.

link

rbmktechik 2551 days ago

Parents algorithm always works, the one described has a massive failure mode - people slowly learn that they are likely to say 7 and self-censor by picking another number. Or maybe in a different culture the distribution changes (say China and the numbers 4, 8).

link

bnegreve 2551 days ago

> the one described has a massive failure mode

Well, if you're really interested in generating random numbers using a group of people, then yes, this may be a problem.

But you should take this as an illustration of the more general problem of generating uniform random numbers from a distribution that is not.

If you think about it this way, author's solution can be applied in a number of practical scenarios. For example, if you want to generate a hash values for objects using properties that are not uniformly distributed. Or if you want to generate random numbers from a physical artifact that is not perfectly uniform.

link

OscarCunningham 2551 days ago

I would guess that in China the probabilities of 4 and 8 goes down. Asking people to pick randomly drives them away from special numbers.

link

Ill_ban_myself 2551 days ago

In the US 7 is a lucky # and a statistical outlier in that it was more frequently chosen in the article above.

link

OscarCunningham 2551 days ago

You might be right, but my guess is that 7's reputation for luck isn't famous enough to make a difference, and in fact it's common because other numbers seem too "unrandom".

We could test by asking people to pick numbers less than 100. I bet people would focus on odd numbers, especially those greater than 50, not ending in 5 or not having both digits the same.

link

sokoloff 2551 days ago

I decided to do that.

Data and light analysis is here: https://docs.google.com/spreadsheets/d/1Dh0wiTCRkBhckWGXtjZg...

I definitely overpaid on Mechanical Turk per response, given how lightning quickly the data came in. (I decided to pay $0.10/HIT for 50 responses and got 68 responses in 8m26s.) I suspect that offering a nickel would have gotten the survey filled in under half an hour still...

link

FabHK 2551 days ago

Excellent point. The algorithm in the article is predicated on a specific distribution and iid, GP algorithm is predicated only on iid.

link

hcs 2551 days ago

That's pretty good, as long as each person can't hear any of the previous numbers! The article has the same issue, and might even be biased by the second person knowing they're the second person.

link

leggomylibro 2551 days ago

Maybe have them write their numbers down and put them in a hat? Or hey, this is 2019, have them connect to a website or bluetooth beacon and enter a number on their phone?

Ooh, or I bet you could make a 10-digit keypad that wirelessly reports to a nearby Raspberry Pi for $2-5 in parts, sort of like those 'clickers' that universities use to quiz large lecture classes.

It's too bad that most cheap radio modules are so short-range, because it would be interesting to make something like that and nail them to telephone poles with notes asking people to press a button. You'd have to control for people doing things like hammering the same button repeatedly for a laugh, but it would be fun to see what happened.

Would you be allowed to use HAM bands in the US for that sort of thing if you made them like beacons which sent a callsign after the number value and device ID?

link

cbsks 2551 days ago

If you are picking numbers from a hat, why not just put in ten numbers and pick one?

link

hxtk 2551 days ago

Here, we see the difference between scientists and engineers ;)

link

leggomylibro 2551 days ago

Well, each group of two could use their own hat. Or they could just write the numbers down instead of saying it. I dunno, the point is they don't have to say them out loud.

link

sansnomme 2551 days ago

Just use the unlicensed ISM bands.

link

karmakaze 2551 days ago

Exactly. Not only aren't people random, they're predictable based on history[0]. Each person could only be asked for one number and they shouldn't even wait perceptibly different lengths of time to be asked.

[0] http://people.ischool.berkeley.edu/~nick/aaronson-oracle/ind...

link

bscphil 2551 days ago

I've tried this a few times (refreshing the page in between). After a hundred or so strokes the computer is consistently less than 50%, usually ~45%. Suggesting my brain has somehow spontaneously worked out whatever algorithm it's using to guess. I don't think I'm random but there's something too simplistic about the method on this page.

link

ssalka 2551 days ago

Works except when both people in every group give the same number

link

your-nanny 2551 days ago

does this algorithm have a name?

link

suf 2551 days ago

https://en.wikipedia.org/wiki/Randomness_extractor#Von_Neuma...

link

your-nanny 2545 days ago

thanks!

link