| HN Mirror


	by jsweojtj 2313 days ago
	The leading digits of a uniform distribution do not follow Benford's law. Look at: https://news.ycombinator.com/item?id=22340018 from elsewhere in this thread or https://news.ycombinator.com/item?id=21541264 of mine from a couple of months ago.

EGreg 2313 days ago

Here, I wrote it up to explain more clearly:

http://magarshak.com/blog/?p=318

Your script only tests samples which are from ranges where the max is a power of 10.

I’m sorry to tell you this, but you inadvertently misled people with that empirical test. This just goes to show that we have to check our assumptions, as scientists or mathematicians trying to prove a statement. (Even with empirical tests :)

Hopefully this message will fix that at least for those people that are reading this thread! The rest will be confused. But that’s what happens in science all the time.

PS: I edited the original Wikipedia page with the explanation :)

mathduck 2313 days ago

You are artificially constructing a very limited range (upper bound and lower bound) in which of course the likelihood of any digit being the first digit won't be equal. This is nothing new. This does not contradict what other commenters have said about uniform distribution. With large ranges, even if you exclude a power of 10 in the upper bound, it does not change the 11.11% chance of each digit being the first digit.

But let us accept your very limited range for a moment and go along with it. Then you say that the numbers in this range follow Benford's law. But clearly, they don't. None of the leading-digit frequencies in this range match the probabilities in Benford's law.

Someone needs to revert the dubious edit (https://en.wikipedia.org/w/index.php?title=Benford%27s_law&d...) you have made in Wikipedia.

EGreg 2313 days ago

This is simply wrong, and creating a new green account and downvoting me isn’t going to change that.

It is trivial to see that literally any range with min = 0 and max = any number other than a power of 10 makes it LESS likely that a 9 will come up as the first digit. For example the range 0-300 has 1 and 2 come up as the first digit way more than the rest. Don’t you think the same is true of 0-30000 and 0-300000000000000000000000? The size of the range doesn’t make your assertion any more true, that for large ranges every leading digit begins to have an equal chance of appearing.
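The 0-300, 0-30000 and larger examples can be counted exactly rather than sampled. A minimal sketch, assuming the integers 1..N (0 is left out because it has no leading significant digit); the helper names `leading_digit_counts` and `leading_digit_shares` are hypothetical and not from the thread:

```python
from fractions import Fraction


def leading_digit_counts(n_max: int) -> dict[int, int]:
    """Count the integers 1..n_max by leading decimal digit, without enumerating them."""
    counts = {d: 0 for d in range(1, 10)}
    block = 1  # current power of ten: 1, 10, 100, ...
    while block <= n_max:
        for d in range(1, 10):
            lo = d * block            # smallest number with leading digit d at this length
            hi = (d + 1) * block - 1  # largest such number
            if lo > n_max:
                break
            counts[d] += min(hi, n_max) - lo + 1
        block *= 10
    return counts


def leading_digit_shares(n_max: int) -> dict[int, Fraction]:
    """Exact share of each leading digit among the integers 1..n_max."""
    counts = leading_digit_counts(n_max)
    total = sum(counts.values())  # equals n_max
    return {d: Fraction(c, total) for d, c in counts.items()}


if __name__ == "__main__":
    for n_max in (300, 30_000, 3 * 10**23, 999, 10**6):
        shares = leading_digit_shares(n_max)
        print(n_max, {d: round(float(s), 4) for d, s in shares.items()})
```

Under these assumptions the share of leading 1s is 111/300 = 37% for N = 300 and stays at about 37% for 30000 and 3*10^23, while N = 999 gives exactly 1/9 per digit. The shares depend on the leading digits of the max, not on how large the range is.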

My point is that, given a uniform distribution from 0 to a max, it has to have a max somewhere. If we assume that max itself is uniformly distributed then we derive the proportions you find in Benford’s law.

Look, to put it another way, Benford’s law comes from the numbers which have the same number of digits as the max. The rest are evenly distributed, but those numbers are the most numerous at that point and they produce the phenomenon. Ok?

Are you convinced?

PS: There has got to be someone who figured this out before 2020. Come on. Someone post a link to this derivation.

foo101 2313 days ago

> creating a new green account and downvoting me isn’t going to change that.

It is impossible on Hacker News for a new green account with less than 500 points to downvote someone else.

EGreg 2313 days ago

Do you admit this statement is wrong?

> With large ranges, even if you exclude a power of 10 in the upper bound, it does not change the 11.11% chance of each digit being the first digit.

The empirical test is cherrypicked also.

If you don’t admit this then there is really no point to continue.

foo101 2313 days ago

Can you provide a concrete example of a range of numbers that you think obeys Benford's law?

jsweojtj 2313 days ago

This is exactly the question I was going to ask.

I wrote: > The leading digits of a uniform distribution do not follow Benford's law.

And @EGreg wrote: > I’m sorry to tell you this, but you inadvertently misled people with that empirical test. This just goes to show that we have to check our assumptions, as scientists or mathematicians trying to prove a statement. (Even with empirical tests :)

So, what specific range of the uniform distribution yields leading digits that follow Benford's law?

EGreg 2313 days ago

Literally any range with min = 0 and where the max isn’t a power of 10.

For example 0-300

One third of numbers are evenly distributed: 0-100

One third starts with 1: 100-200

One third starts with 2: 200-300

Do you understand?

jsweojtj 2312 days ago

I understand.

There is a distribution of leading digits that looks like:

1: 30.1%, 2: 17.6%, 3: 12.5%, 4: 9.7%, 5: 7.9%, 6: 6.7%, 7: 5.8%, 8: 5.1%, 9: 4.6%

As Wikipedia says, "It has been shown that this result applies to a wide variety of data sets, including electricity bills, street addresses, stock prices, house prices, population numbers, death rates, lengths of rivers, physical and mathematical constants."

Neat! For each of those data sets you get the same distribution. Now, someone (I won't say who), says that it also is true for the uniform distribution.

But it isn't.

It simply isn't.

And I said as much when I said, "The leading digits of a uniform distribution do not follow Benford's law."

And your counterexample is that if you take a uniform distribution from 0-300, the leading digits go to something like:

1: 37%, 2: 37%, 3 through 9: about 3.7% each

Great, so I don't know how we can disagree at this point. The above distribution is not Benford's Law.

> "The leading digits of a uniform distribution do not follow Benford's law." -- me

And you, directly disagreeing with that correct statement:

> This just goes to show that we have to check our assumptions, as scientists or mathematicians trying to prove a statement. -- EGreg

Indeed.

BenoitEssiambre 2313 days ago

That's not Benford's law though. That's just a weird distribution due to a weird cutoff.

Benford's law is 1: 30.1%, 2: 17.6%, 3: 12.5%, etc.

EGreg 2313 days ago

For the record, you’re moving the goalposts. The OP claimed that his example proves that the digits always have the same chance of appearing, which is clearly false.

When the max is uniformly distributed then Benford’s law emerges. I mean, all you have to do is read the link - where I derive it.

What exactly is the law — please don’t handwave. If the law is those exact point values mentioned in the article then I just showed you how we arrived at them.

susam 2313 days ago

The numbers in the range 0-300 do not obey Benford's law. In base 10, a set of numbers obeys Benford's law if the leading significant digit d (0 < d < 10) occurs with probability log10(1 + 1/d). This isn't the case for the set of numbers between 1 and 300, inclusive.
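The definition can be checked directly against the 1..300 example. A minimal sketch, assuming the integers 1 through 300 inclusive; `benford_probabilities` and `observed_shares` are hypothetical helper names:

```python
import math
from collections import Counter


def benford_probabilities() -> dict[int, float]:
    """P(d) = log10(1 + 1/d) for leading digits d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def observed_shares(numbers) -> dict[int, float]:
    """Fraction of the given positive integers that start with each digit."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


if __name__ == "__main__":
    expected = benford_probabilities()
    observed = observed_shares(range(1, 301))
    for d in range(1, 10):
        print(f"{d}: Benford {expected[d]:.3f}  uniform 1..300 {observed[d]:.3f}")
```

For d = 1 this compares 0.301 (Benford) with 0.370 (observed), and for d = 9 it compares 0.046 with 0.037.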

EGreg 2313 days ago

Your assertion that for large ranges every digit has the same chance of appearing is very wrong. Your empirical test is rigged by choosing a very rare max, literally the only one where it would “prove” your assertion.

Benford’s law appears when the max of your range is uniformly distributed.
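This claim can be tested numerically. A minimal sketch, assuming the max is drawn uniformly from the integers 1..`max_limit` and the sample is then drawn uniformly from 1..max; `averaged_shares` is a hypothetical helper name and the `max_limit` values are arbitrary choices. It prints the averaged leading-digit shares next to log10(1 + 1/d) so the two can be compared for different limits:

```python
import math


def averaged_shares(max_limit: int) -> dict[int, float]:
    """Leading-digit shares of a uniform draw from 1..N, averaged over N = 1..max_limit."""
    running = {d: 0 for d in range(1, 10)}   # leading-digit counts for 1..n so far
    totals = {d: 0.0 for d in range(1, 10)}  # sum over n of the share of digit d in 1..n
    for n in range(1, max_limit + 1):
        running[int(str(n)[0])] += 1
        for d in range(1, 10):
            totals[d] += running[d] / n
    return {d: totals[d] / max_limit for d in range(1, 10)}


if __name__ == "__main__":
    for max_limit in (999, 3_000, 9_999):
        shares = averaged_shares(max_limit)
        print(f"max_limit = {max_limit}")
        for d in range(1, 10):
            benford = math.log10(1 + 1 / d)
            print(f"  {d}: averaged {shares[d]:.3f}  Benford {benford:.3f}")
```

The result is an exact average rather than a simulation, so any difference between the two columns, or between runs with different `max_limit`, is not sampling noise.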
