by grkvlt
3304 days ago
> Not a very common first or last name in themselves, but in combination quite a common match

I'm sort of confused. You state that your first name and last name are not common. That is, there is a low probability some random person has the same first name as you, and also a low probability some random person has the same last name as you. We can write that as follows:

P(N_f = F, N_l = *) < X
P(N_f = *, N_l = L) < X

Where X is a low probability, and F and L are your first and last names while N_f and N_l are the first and last names of some random person. So, how can the following hold:

P(N_f = F, N_l = L) > X

That is, the probability some random person has both your first name and your last name is higher, or in your words the occurrence is 'quite common.' It should be obvious this is not possible.
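
For reference, the bound behind that claim can be written out with the same symbols. The event "first name is F and last name is L" is contained in each of the single-name events, so its probability cannot exceed either marginal. The conditional probability in the last line is not part of the comment above; it is added here as an assumption about what 'quite a common match' might mean instead.

```latex
\begin{align*}
P(N_f = F,\, N_l = L) &\le \min\bigl\{ P(N_f = F),\; P(N_l = L) \bigr\} < X \\
P(N_f = F \mid N_l = L) &= \frac{P(N_f = F,\, N_l = L)}{P(N_l = L)}
\end{align*}
```

The joint probability stays below X, but the conditional can be large: dividing a small joint probability by a small marginal can give a value close to 1.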
|
For example, meet a 'Jones'. There's a good probability that his first name is 'Thomas'.
Meet a 'Thomas'. There's a low probability his last name is 'Jones'.
That's because 'Jones' is predominantly Welsh in origin, and 'Thomas' is a popular given name in Wales. This of course would be more true in Wales than among people of Welsh ancestry living far from Wales. And perhaps both despite and because of the great singer Tom Jones, this combination may have become less common over the past few decades.
However, my family name is pretty location-specific in the UK, and even the diaspora of the name that went to places like North America tended to keep up the tradition, even 2-3 centuries later.
Another example of non-independence would be a name like 'Ahmed' as given name and 'Zhang' as family name. 'Zhang' is an extremely common family name on a global scale, as is 'Ahmed' as a given name. However, the probability of 'Ahmed' and 'Zhang' overlapping as a combination is slim. Perhaps it could happen in Singapore or Malaysia, but even then 'Zhang' is probably converted to a Hokkien/Hakka/Cantonese equivalent spelling, which is not 'Zhang'. Given the scale of these names, I'm sure there are 'Ahmed Zhang's knocking around, but probably not that many.
The great thing about statistics is that it is about discovery, not assumptions.
And assuming everything is nice easy math, independent, or stochastic, is one of the greatest mistakes we can all make when looking at numbers.
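
To make the two examples concrete, here is a minimal Python sketch. The population counts are entirely made up for illustration (they are not real name statistics), and the helper names are hypothetical. It only shows the shape of the argument: the two conditional probabilities are asymmetric, and the joint probability can sit well above or well below the product of the marginals when names are not independent.

```python
from collections import Counter

# Hypothetical toy population: (given name, family name) -> head count.
# These numbers are invented purely to illustrate the argument.
population = Counter({
    ("Thomas", "Jones"): 30,
    ("David", "Jones"): 20,
    ("Thomas", "Smith"): 570,
    ("David", "Smith"): 380,
    ("Ahmed", "Khan"): 4000,
    ("Wei", "Zhang"): 5000,
    ("Ahmed", "Zhang"): 0,
})
total = sum(population.values())


def count_first(first):
    """Number of people with the given first name."""
    return sum(c for (f, _), c in population.items() if f == first)


def count_last(last):
    """Number of people with the given family name."""
    return sum(c for (_, l), c in population.items() if l == last)


def p_first(first):
    return count_first(first) / total


def p_last(last):
    return count_last(last) / total


def p_joint(first, last):
    return population[(first, last)] / total


def p_first_given_last(first, last):
    """P(given name = first | family name = last)."""
    return population[(first, last)] / count_last(last)


def p_last_given_first(last, first):
    """P(family name = last | given name = first)."""
    return population[(first, last)] / count_first(first)


for first, last in [("Thomas", "Jones"), ("Ahmed", "Zhang")]:
    print(f"{first} {last}:")
    print(f"  joint probability       {p_joint(first, last):.4f}")
    print(f"  product of marginals    {p_first(first) * p_last(last):.4f}")
    print(f"  P({first} | {last})     {p_first_given_last(first, last):.4f}")
    print(f"  P({last} | {first})     {p_last_given_first(last, first):.4f}")
```

In this toy data the joint probability for 'Thomas Jones' is well above what independence would predict, while for 'Ahmed Zhang' it is well below, even though both of those names are individually common. In both cases the joint probability never exceeds either marginal.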