Your script only tests samples which are from ranges where the max is a power of 10.
I’m sorry to tell you this, but you inadvertently misled people with that empirical test. This just goes to show that we have to check our assumptions, as scientists or mathematicians trying to prove a statement. (Even with empirical tests :)
Hopefully this message will fix that at least for those people that are reading this thread! The rest will be confused. But that’s what happens in science all the time.
PS: I edited the original Wikipedia page with the explanation :)
You are artificially constructing a very limited range (upper bound and lower bound) in which of course the likelihood of any digit being the first digit won't be equal. This is nothing new. This does not contradict what other commenters have said about uniform distribution. With large ranges, even if you exclude a power of 10 in the upper bound, it does not change the 11.11% chance of each digit being the first digit.
But let us accept your very limited range for a moment and go along with it. Then you say that the numbers in this range follow Benford's law. But clearly, they don't. None of the leading-digit probabilities in this range match the probabilities in Benford's law.
This is simply wrong, and creating a new green account and downvoting me isn’t going to change that.
It is trivial to see that literally any range with min = 0 and max = any number other than a power of 10 makes it LESS likely that a 9 will come up as the first digit. For example, the range 0-300 has 1 and 2 come up as the first digit way more often than the rest. Don’t you think the same is true of 0-30000 and 0-300000000000000000000000? Making the range larger doesn’t make your assertion, that for large ranges every leading digit begins to have an equal chance of appearing, any more true.
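To make the 0-300 example concrete, here is a minimal sketch that counts leading digits over the integers 1..max. It assumes we only look at whole numbers and skip 0, and the function name `leading_digit_counts` is made up for illustration.

```python
from collections import Counter


def leading_digit_counts(max_value):
    """Count leading digits of the integers 1..max_value (0 is skipped).

    Returns a list of nine counts, for leading digits 1 through 9.
    """
    counts = Counter(int(str(n)[0]) for n in range(1, max_value + 1))
    return [counts[d] for d in range(1, 10)]


if __name__ == "__main__":
    # Two maxes from the comment above, plus two maxes that sit right
    # below a power of 10 for comparison.
    for max_value in (300, 30000, 999, 99999):
        counts = leading_digit_counts(max_value)
        total = sum(counts)
        print(max_value, [round(c / total, 3) for c in counts])
```

For max = 300 the counts are 111 for digit 1, 111 for digit 2, 12 for digit 3 and 11 for each of the digits 4 through 9, which is where the "1 and 2 come up way more" observation comes from.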
My point is that, given a uniform distribution from 0 to a max, it has to have a max somewhere. If we assume that max itself is uniformly distributed then we derive the proportions you find in Benford’s law.
Look, to put it another way, Benford’s law comes from the numbers which have the same number of digits as the max. The rest are evenly distributed, but the numbers with as many digits as the max are the most numerous, and they produce the phenomenon. Ok?
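Anyone who wants to check the "uniformly distributed max" claim numerically, rather than take either side's word for it, could try something like the sketch below. It rests on assumptions that are not spelled out in this thread: the max is drawn uniformly from the integers 1..largest_max, the samples are the integers 1..max, and the function name is hypothetical. It prints the averaged leading-digit frequencies next to log10(1 + 1/d), so you can see for yourself how close the two columns are.

```python
import math


def averaged_leading_digit_distribution(largest_max):
    """Average the leading-digit frequencies of 1..m over every m in 1..largest_max.

    This models "pick the max uniformly, then pick a number uniformly
    from 1..max" without any random sampling.
    """
    running = [0] * 9    # leading-digit counts for the integers 1..m
    totals = [0.0] * 9   # sum over m of (frequency of each digit in 1..m)
    for m in range(1, largest_max + 1):
        running[int(str(m)[0]) - 1] += 1
        for i in range(9):
            totals[i] += running[i] / m
    return [t / largest_max for t in totals]


if __name__ == "__main__":
    averaged = averaged_leading_digit_distribution(9999)
    for d, p in enumerate(averaged, start=1):
        # Columns: digit, averaged frequency, Benford's log10(1 + 1/d)
        print(d, round(p, 4), round(math.log10(1 + 1 / d), 4))
```

The choice of `largest_max` is itself an assumption, so it is worth rerunning with a few different values (for example 3000 as well as 9999) before drawing conclusions.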
Are you convinced?
PS: There has got to be someone who figured this out before 2020. Come on. Someone post a link to this derivation.
I wrote:
> The leading digits of a uniform distribution does not follow Benford's law.
And @EGreg wrote:
> I’m sorry to tell you this, but you inadvertently misled people with that empirical test. This just goes to show that we have to check our assumptions, as scientists or mathematicians trying to prove a statement. (Even with empirical tests :)
So, what specific range of the uniform distribution yields leading digits that follow Benford's law?
As Wikipedia says, "It has been shown that this result applies to a wide variety of data sets, including electricity bills, street addresses, stock prices, house prices, population numbers, death rates, lengths of rivers, physical and mathematical constants."
Neat! For each of those data sets you get the same distribution. Now, someone (I won't say who) says that it is also true for the uniform distribution.
But it isn't.
It simply isn't.
And I said as much when I said, "The leading digits of a uniform distribution does not follow Benford's law."
And your counterexample is that if you take a uniform distribution from 0-300, the leading digits go to something like:
For the record, you’re moving the goalposts. The OP claimed that his example proves that the digits always have the same chance of appearing, which is clearly false.
When the max is uniformly distributed then Benford’s law emerges. I mean, all you have to do is read the link - where I derive it.
What exactly is the law — please don’t handwave. If the law is those exact point values mentioned in the article then I just showed you how we arrived at them.
The numbers in the range 0-300 do not obey Benford's law. In base 10, a set of numbers obeys Benford's law if the leading significant digit d (0 < d < 10) occurs with probability log10(1 + 1/d). This isn't the case for the set of numbers between 1 and 300, inclusive.
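For reference, the definition is easy to tabulate. This is just a sketch with made-up helper names that prints log10(1 + 1/d) for each digit next to the fraction of the integers 1 through 300 that start with that digit.

```python
import math


def benford_probability(d):
    """Benford's law probability of leading digit d in base 10 (d in 1..9)."""
    return math.log10(1 + 1 / d)


def leading_digit_fraction(d, max_value):
    """Fraction of the integers 1..max_value whose leading digit is d."""
    hits = sum(1 for n in range(1, max_value + 1) if str(n)[0] == str(d))
    return hits / max_value


if __name__ == "__main__":
    # Columns: digit, Benford probability, observed fraction in 1..300
    for d in range(1, 10):
        print(d,
              round(benford_probability(d), 3),
              round(leading_digit_fraction(d, 300), 3))
```

Benford gives about 0.301 for digit 1 and 0.176 for digit 2, while in 1..300 both digits come up 111 times out of 300 (0.37).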
Your assertion that for large ranges every digit has the same chance of appearing is very wrong. Your empirical test is rigged by choosing a very rare max, literally the only one where it would “prove” your assertion.
Benford’s law appears when the max of your range is uniformly distributed:
http://magarshak.com/blog/?p=318