Polya Urn Simulation

	Polya Urn Simulation (observablehq.com)
	50 points by cmoog 1180 days ago

4 comments

bjornsing 1180 days ago

I must admit: this went against my intuition. My first guess was that you would end up with an urn full of either red or blue balls.

OscarCunningham 1179 days ago

Bear in mind the default behaviour if there were just two balls and you never added any more. Then the proportion of red picks vs blue picks would tend to 1/2. So there's naturally a tendency for the proportion to concentrate in the middle.

As you say, the way in which new balls are added tends to push the proportion towards the extremes.

The uniform distribution is the result of these two tendencies exactly cancelling out.
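
The uniform result can be checked exactly for a finite number of picks. The sketch below is not the code from the linked notebook. It is a small dynamic program (the name `red_count_distribution` is made up here) that assumes the usual rule: draw one ball uniformly at random, put it back, and add one more ball of the same colour, starting from one red and one blue.

```python
from fractions import Fraction


def red_count_distribution(picks, red=1, blue=1):
    """Exact distribution of the number of red balls after `picks` draws.

    Assumed rule: each draw picks a ball uniformly at random and returns it
    together with one extra ball of the same colour.
    """
    dist = {red: Fraction(1)}  # red count -> probability
    total = red + blue
    for _ in range(picks):
        nxt = {}
        for r, p in dist.items():
            p_red = Fraction(r, total)
            # Red drawn: one more red ball.
            nxt[r + 1] = nxt.get(r + 1, 0) + p * p_red
            # Blue drawn: red count unchanged.
            nxt[r] = nxt.get(r, 0) + p * (1 - p_red)
        dist = nxt
        total += 1
    return dist


dist = red_count_distribution(picks=10)
for r in sorted(dist):
    print(r, dist[r])
```

After 10 picks the red count takes each value from 1 to 11 with probability exactly 1/11, so the proportion is uniform over its possible values at every stage, not only in the limit.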

cmoog 1179 days ago

This is not correct as stated. It does not “tend towards extremes” as you might expect intuitively. That would be the case if half the time it approached 100% blue and half the time it approached 100% red, which is precisely not what happens.

cmoog 1179 days ago

For me as well. And when my stochastic probability professor posed this question to the class by a show of hands, the vote was nearly unanimous in favor of the 0/100% end behavior.

xeyownt 1180 days ago

Yeah, I don't know what my intuition was.

But the problem is symmetric, and even if you pick a red, you end up with two reds and one blue, so it's not that imbalanced.

And even if the mix becomes really imbalanced, say 7 red and one blue, picking the rarest color will have more effect than picking the most common one. So you could consider that the system tries to balance itself naturally, hence avoiding huge swings in some direction or the other.
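
The "7 red and one blue" case can be worked out with exact fractions. This is a minimal sketch under the usual rule (the drawn ball goes back with one extra of its colour); `step_effects` is a hypothetical helper name, not something from the simulation.

```python
from fractions import Fraction


def step_effects(red, blue):
    """Change in the red fraction for each possible pick, with its probability."""
    total = red + blue
    frac = Fraction(red, total)
    up = Fraction(red + 1, total + 1) - frac   # change if a red ball is picked
    down = Fraction(red, total + 1) - frac     # change if a blue ball is picked
    p_red = frac                               # chance of picking red
    p_blue = 1 - frac                          # chance of picking blue
    return up, down, p_red, p_blue


up, down, p_red, p_blue = step_effects(7, 1)
print(up, down, p_red, p_blue)
```

Picking the rare blue ball moves the red fraction by 7/72, seven times as far as the 1/72 move from picking red, but it is also seven times less likely (1/8 versus 7/8).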

foobarbecue 1180 days ago

Me too, but only because I was expecting something interesting to happen since it was on HN!

kgwgk 1179 days ago

The uniform distribution result is not interesting enough for you?

foobarbecue 1179 days ago

Oh, I see! Usually "this sort of thing" would have a normal distribution I guess? That IS quite interesting.

eru 1178 days ago

Usually 'this sort of thing' would go to the extremes: colours that are already prevalent have a bigger chance of getting more added to them.

It's interesting that they don't.

A normal distribution would be extremely weird and unexpected (and not even really possible): we know for sure that the proportion in the end has to be between 0% and 100%. Normal distributions don't have such cutoffs.

kgwgk 1178 days ago

On a log-odds (logit) scale, though, we get something that resembles a normal (a logistic distribution).
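
The logistic claim follows from a one-line argument: if U is uniform on (0, 1), then P(logit(U) ≤ x) = P(U ≤ σ(x)) = σ(x), where σ(x) = 1 / (1 + e^(-x)) is the standard logistic CDF. The sketch below only checks numerically that σ undoes the logit, which is the step the argument relies on. The helper names are chosen here for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def logistic_cdf(x):
    """CDF of the standard logistic distribution."""
    return 1 / (1 + math.exp(-x))


# If logistic_cdf(logit(u)) == u for every u, then for uniform U the event
# logit(U) <= x is the same as U <= logistic_cdf(x), which has probability
# logistic_cdf(x).
grid = [i / 100 for i in range(1, 100)]
max_error = max(abs(logistic_cdf(logit(u)) - u) for u in grid)
print(max_error)
```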

theK 1180 days ago

Pretty sure the variables the author picked are not the most interesting ones.

Urn models are engineered to have a rich get richer bias which is best seen by varying the initial populations.

Instead of offering trial count and pick count (which are invariants in the actual model), he could have offered initial ball count and initial red/blue ratio.

cmoog 1179 days ago

Ah, good idea! I'll add those in a bit and will remark that the answer/proof are specific for the special case where r_0 = b_0 = 1.
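
For general starting counts, the standard result (not stated in this thread) is that the limiting red fraction follows a Beta(r_0, b_0) distribution, which reduces to the uniform distribution when r_0 = b_0 = 1. The sketch below is a rough consistency check by simulation, not a proof. The function names, the seed, and the choice r_0 = 3, b_0 = 1 are all assumptions made for illustration.

```python
import random


def final_red_fraction(red, blue, picks, rng):
    """Run one urn for `picks` draws and return the final red fraction."""
    for _ in range(picks):
        if rng.random() < red / (red + blue):
            red += 1
        else:
            blue += 1
    return red / (red + blue)


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


rng = random.Random(0)  # fixed seed so the run is repeatable
r0, b0 = 3, 1           # illustrative initial counts
fracs = [final_red_fraction(r0, b0, 500, rng) for _ in range(2000)]
mean = sum(fracs) / len(fracs)
var = sum((f - mean) ** 2 for f in fracs) / len(fracs)
print("simulated:", mean, var)
print("Beta(3, 1):", beta_mean_var(r0, b0))
```

The simulated mean and variance should land close to 0.75 and 0.0375, the Beta(3, 1) values, up to sampling noise.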

planede 1179 days ago

The proof seems to concentrate on the marginal distribution as n goes to infinity. But the simulation hints at something more interesting: each sample of the random process seems to converge to a value, where the value itself is U(0,1).

Is it true that a sample of the random process is convergent with probability 1?

theK 1180 days ago

> After a large number of picks, what is the behavior of the proportion of red balls in the urn

Isn’t the more enticing question how strong the bias towards the first picked color is?

fjfaase 1179 days ago

The rather boring answer is 2/3. Logic seems to indicate that it will be a linear distribution where the chance of ending with only balls of the first picked color is maximal and the chance of only balls of the other color is zero, because it is no longer possible to pick only balls of the other color. It must be linear because it needs to be symmetric and lead to a uniform distribution if added together.

After two picks, you have three cases. If you picked two different colored balls, the distribution should be a uniform distribution again, just like the initial state. The other two distributions should mirror each other, and thus be linear again.

Maybe something interesting happens with three picks. Or maybe you always end up with linear distributions with tilted slopes. In that case it is rather boring.
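
The 2/3 figure can be sanity-checked by simulation. After the first pick the urn holds two balls of that colour and one of the other, and the standard Beta(2, 1) limit for that state has the linear density 2x, mean 2/3, and a 3/4 chance of ending above one half. The sketch below assumes the usual add-one-ball rule; the function name, seed, and run sizes are illustrative choices.

```python
import random


def first_colour_share(picks, rng):
    """Final share of whichever colour happened to be picked first."""
    counts = [1, 1]  # one ball of each colour to start
    first = None
    for _ in range(picks):
        colour = 0 if rng.random() < counts[0] / sum(counts) else 1
        if first is None:
            first = colour  # remember the colour of the very first pick
        counts[colour] += 1
    return counts[first] / sum(counts)


rng = random.Random(0)  # fixed seed so the run is repeatable
shares = [first_colour_share(500, rng) for _ in range(2000)]
mean_share = sum(shares) / len(shares)
majority = sum(s > 0.5 for s in shares) / len(shares)
print(mean_share, majority)
```

The mean share should come out near 2/3, and the first-picked colour should hold the majority in roughly three runs out of four.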

fjfaase 1178 days ago

I am mistaken. After two picks, the case with two of the same color does not result in a linear distribution. You can easily check this by modifying the code.
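
The three cases after two picks can be written out, assuming the standard result that an urn with r red and b blue balls has a Beta(r, b) limit, with density proportional to x^(r-1) (1-x)^(b-1). Starting from one red and one blue, the urn is at (3, 1), (2, 2), or (1, 3) with probability 1/3 each:

```latex
f_{(3,1)}(x) = 3x^{2}, \qquad
f_{(2,2)}(x) = 6x(1-x), \qquad
f_{(1,3)}(x) = 3(1-x)^{2}

\tfrac{1}{3}\left[\,3x^{2} + 6x(1-x) + 3(1-x)^{2}\,\right]
  = x^{2} + 2x(1-x) + (1-x)^{2}
  = \bigl(x + (1-x)\bigr)^{2}
  = 1
```

Under that assumption none of the three densities is linear or uniform on its own, including the mixed-colour case, but their equal-weight mixture is exactly the uniform density.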

kgwgk 1179 days ago

The expected terminal fraction is always equal to the current fraction. (Or is it?)

inimino 1179 days ago

Yes, it is.
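
The one-step version of this is easy to verify with exact arithmetic: from r red and b blue, the expected red fraction after the next pick is again r / (r + b). Repeating the argument pick by pick extends it to any number of picks, and because the fraction is bounded the same holds for the limit. The sketch below checks the one-step identity over a grid of states; `expected_next_fraction` is a name made up here.

```python
from fractions import Fraction


def expected_next_fraction(red, blue):
    """Expected red fraction after one more pick, computed exactly."""
    total = red + blue
    p_red = Fraction(red, total)
    return (p_red * Fraction(red + 1, total + 1)
            + (1 - p_red) * Fraction(red, total + 1))


states = [(r, b) for r in range(1, 20) for b in range(1, 20)]
is_martingale = all(
    expected_next_fraction(r, b) == Fraction(r, r + b) for r, b in states
)
print(is_martingale)
```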
