by etruong42 5469 days ago
Finding both stones when you threw 2 is less than twice as hard as finding the only stone when you threw just 1.

Suppose you are the guy looking for the stones. There are two stones in the desert. Everything being random but equal, you are twice as likely to run into a stone when there are two than when there is only one stone in the desert. Once you find the first stone, it is equally difficult to find the second stone as it is to find only one stone at the beginning (if you treat "finding a stone" as independent events where you don't learn about the location of subsequent stones).
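A minimal sketch of the arithmetic in the paragraph above, under that paragraph's own simplifications: each search step hits any one remaining stone with a small probability `p`, and with `k` stones left the chance of hitting one is taken to be `k * p` ("twice as likely" with two stones). The function name and the value of `p` are made up for illustration.

```python
from fractions import Fraction

def expected_steps_to_find_all(n_stones, p):
    """Expected search steps to find every stone.

    Assumption (from the comment, not a general result): with k stones
    still hidden, each step finds one with probability k * p, so the
    expected wait for the next find is 1 / (k * p).
    """
    return sum(Fraction(1) / (k * p) for k in range(1, n_stones + 1))

p = Fraction(1, 1000)  # hypothetical per-step chance of hitting a given stone

one_stone = expected_steps_to_find_all(1, p)
two_stones = expected_steps_to_find_all(2, p)
ratio = two_stones / one_stone

print(one_stone, two_stones, ratio)
```

Under these assumptions the two-stone search costs 1.5 times the one-stone search, which is consistent with "less than twice as hard".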

So while the idea is interesting, the analogy is poor. I much prefer the Wikipedia explanation, which is similar to yours but much more logically rigorous: http://en.wikipedia.org/wiki/Benfords_law#Outcomes_of_expone...

Response to update: Now I feel that you are overcomplicating your analogy. Can multiple stones occupy the same square? How is it appropriate to equate/compare "the number of squares you walk through in order to pick up all the stones" to "the number of times a digit should show up"? I apologize, but your illustration has become completely lost on me.


> Can multiple stones occupy the same square?

You're right, I should have clarified. If multiple stones could not occupy the same square, the odds would remain as I first explained them (3x, etc.). I think that in my stones analogy, as in real life, stones should be able to occupy the same square. In fact, there should be a positive correlation (i.e., given that there's a rock in this square, the odds of a second rock being there go up).

> How is it appropriate to equate/compare "the number of squares you walk through in order to pick up all the stones" to "the number of times a digit should show up"?

Coming across 3 units of a quantity is 3x as hard as coming across 1 unit. When we write numbers, we are either:

1) writing a truthful description of how many units we see/own/ate/taste/touch etc. (I ate 2 bagels, I earned $5, I ran 10 miles.)

2) lying.

By "lying", I'm including things like writing a novel. Maybe a better word is "imagining". With numbers, we are either writing down true observations or we are imagining them. It's just as easy to "imagine" $9 million in your bank account as it is to "imagine" $1 million, while truthfully finding $9 million in your bank account is a lot more difficult :). This is why Benford's law doesn't apply for "imagined" numbers. By using Benford's law, you can quickly classify a number set into either "real" or "imagined".
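A rough sketch of the kind of comparison described above, not a real classifier. It uses the standard Benford expectation P(d) = log10(1 + 1/d). Two things are assumptions for illustration only: powers of 2 stand in for "real" growth-like data, and uniformly random integers stand in for "imagined" numbers (numbers people actually make up are not exactly uniform).

```python
import math
import random
from collections import Counter

def benford_expected(d):
    """Benford's law: expected share of numbers whose leading digit is d."""
    return math.log10(1 + 1 / d)

def leading_digit_frequencies(numbers):
    """Share of each leading decimal digit 1..9 among positive integers."""
    counts = Counter(int(str(n)[0]) for n in numbers)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

# Stand-in for "real" quantities: exponential growth (assumption).
growth = [2 ** k for k in range(1000)]

# Stand-in for "imagined" quantities: uniform picks (assumption).
rng = random.Random(0)  # fixed seed so the run is repeatable
imagined = [rng.randint(1, 999_999) for _ in range(10_000)]

growth_freq = leading_digit_frequencies(growth)
imagined_freq = leading_digit_frequencies(imagined)

for d in range(1, 10):
    print(d, round(benford_expected(d), 3),
          round(growth_freq[d], 3), round(imagined_freq[d], 3))
```

The growth column should track the Benford column closely, while the uniform column should sit near 1/9 for every digit.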

Ah. But there's the rub. What I find unintuitive about Benford's law is the non-random distribution of the most significant digit regardless of base or unit of measure. You propose that it's the "largeness" of a number that enforces Benford's law. While that may be in some ways true, it does not explain the transparency to base or unit of measure. You ate 2 bagels? I ate 4 half-bagels. You earned $5? I earned ¥600 motherfucker! You ran 10 miles? I ran 3bf3e6800 micrometers, in base-16!
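A quick check of the last conversion above, as a sketch. It assumes the international mile (exactly 1609.344 m); `leading_digit` is a helper name made up here. The point is only that the same distance gets a different most significant digit depending on the unit and the base.

```python
MICROMETERS_PER_MILE = 1_609_344_000  # 1 mile = 1609.344 m (international mile)

def leading_digit(n, base=10):
    """Most significant digit of a positive integer in the given base."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    while n >= base:
        n //= base
    return n

miles = 10
micrometers = miles * MICROMETERS_PER_MILE
hex_repr = format(micrometers, "x")

print(miles, leading_digit(miles))                   # in miles, base 10
print(micrometers, leading_digit(micrometers))       # in micrometers, base 10
print(hex_repr, leading_digit(micrometers, 16))      # in micrometers, base 16
```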

Again, your train of thought is not necessarily wrong, but I still find the Wikipedia explanation much more robust and illustrative. I hope this is where we can agree to disagree.

The reason why it only applies to the most significant digit is that I can say for certain that quantities in the 1_ family will appear ~2x as often as quantities in the 2_ family. However, I can't say whether numbers ending in 1 are more common than numbers ending in 6: although 11 occurs more than any number higher than it, it makes up a minuscule proportion of the numbers ending in 1, and 16 occurs more than 21, 31, etc. So there's no clear way to predict which digit will occur most in any position but the most significant.
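For comparison with the "~2x" figure, here is what the standard Benford formula P(d) = log10(1 + 1/d) gives for a leading 1 versus a leading 2. This is the textbook formula, not something derived from the stones analogy.

```python
import math

def benford_expected(d):
    """Benford's law: expected share of numbers whose leading digit is d."""
    return math.log10(1 + 1 / d)

p1 = benford_expected(1)  # about 0.301
p2 = benford_expected(2)  # about 0.176
ratio = p1 / p2           # about 1.71

print(round(p1, 3), round(p2, 3), round(ratio, 2))
```

So the formula puts the ratio nearer 1.7 than 2, which is in the same ballpark as the rough estimate above.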

Thanks for offering your views. My analogy may be wrong or weak and maybe there is a better one to be found.

Every base is base 10. There's your answer.