| ok ok... let me try to get this straight. Just as kind of a mental process for trying to understand whether or not something passes the smell test, I typically try to take the basic premise and turn it up to 11 and see if that still makes sense. In this problem, as you've described it, we're enumerating "ways to assign gender and birth-day-of-week." We can do this because there are a countable number of "days of the week" (so we can map to the integers: 1-7) AND there is also a surjective function of [child] -> [day of the week they were born]. Am I right so far? Now let's replace the set [1-7] with another countable set that also maintains the surjective function. We could say "day in the lunar cycle" (so ~27 options), or better "day of the year" (366 options), for example. Do we now need to consider the 2 * 366 * 2 * 366 ways to assign gender and birth-day-of-the-year? Take it further with whatever you want: "birth weight in milligrams" or "number of freckles" (as I previously suggested). All countable things that meet the surjective requirement. This is starting to smell funny, right? So let's take a look at the math. You say there are 2 * 7 * 2 * 7 ways to configure day+gender, assuming independence for kid 1 (k1) and kid 2 (k2). This represents: (k1 gender options * k1 day of week options) * (k2 gender options * k2 day of week options). Right? I'm with you so far. Then you say "Of these possibilities, 27 are situations where one kid is a Tuesday boy." Hold up. We are given two pieces of information: that one of the kids is a boy, and that particular boy was born on a Tuesday. Let's say the boy is k1 (this is an assignment of enumeration, not of "who came first;" just like Sunday = 1 does not mean that any kid born on a Sunday was born before every kid born on Monday = 2). So now the k1 options are [1 * 1] (boy, Tuesday), and the total number of options is: [1 * 1] * [2 * 7] = 14. Of those 14, 7 are girl options. And we're back to a straight 50%. So yes, I dispute the 27 number. 
It seems like it is arrived at by 2 * 1 * 2 * 7, minus one for an apparent duplicate. But the 2 * 1 * 2 * 7 represents maintaining gender non-specificity for Tuesday boy, which should be incorrect, no? > You have stated by fiat that certain things are irrelevant to certain other things... Yes, but that's what "independent" means, right? You also stated that you're assuming these two things are independent, hence equiprobability. But independence is defined by P(A) = P(A|B). The probability of A is completely unaffected by B. Yet the outcome you arrive at is that P(A) IS affected by B, so the math presented is internally inconsistent. What am I missing here? I'm fascinated by the uncertainty around this little problem. |
problem 1) You go up to a person and ask them if they have exactly 2 children, at least one of which is a boy born on Tuesday. They say yes. What is the probability that they have a girl?
problem 2) You go up to a person and ask them if they have exactly 2 children, at least one of which is a boy. They say yes. You then ask them which day of the week a boy they have was born on. They say Tuesday. What is the probability that they have a girl?
The original problem that was posed is equivalent to problem 1, but not equivalent to problem 2. This could be what is confusing you, because in problem 2 the extra information plays no role in the selection process, while it does play a role in problem 1. In problem 2, the answer is the standard 2/3. Why are the probabilities different between problem 1 and 2? Here's why:
Think about the set of people who could answer yes to the first question in problem 2, split into groups by family type (BB, BG, GB). The ratio between these groups is what matters. A parent with BB (two boys) is exactly as likely to answer yes in problem 2 (100% likely, to be exact) as a parent with BG or GB (also 100% likely to answer yes), which leads to the correct solution of 2/3. However, in problem 1 a parent with BB is NOT EQUALLY LIKELY to answer yes as a parent with BG. This is because we added an extra qualifier (must be born on Tuesday). The parent with BB has two chances to meet this qualifier because they have two boys, so the parent with BB is actually more likely to answer yes to the question than the parent with BG. As the qualifier becomes more and more rare (day of lunar cycle, day of year), the probability of the BB parent answering yes, P(yes|BB), approaches twice the value of P(yes|BG). So now you're left with some subset of parents with BB, BG, and GB, but in this scenario you've sampled from BB approximately twice as heavily as you've sampled from each of the BG and GB groups, leaving you with approximately the same number of people from group BB as from groups BG and GB combined. This is why the probability approaches 50%.
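If you'd rather count than take my word for it, here is a small Python sketch that brute-forces problem 1 by enumerating every equally likely family. The function names and the convention that the qualifying day is day 0 are my own choices for illustration, and it assumes (like the rest of this thread) that sex and birth day are independent and uniform. With 7 days it should give 14/27, and the answer should creep toward 1/2 as the qualifier gets rarer.

```python
from fractions import Fraction
from itertools import product


def p_girl_given_yes(n_days: int) -> Fraction:
    """Problem 1: exact P(family has a girl | at least one boy born on day 0)."""
    kids = list(product("BG", range(n_days)))  # every (sex, day) for one child
    # Families where the parent would answer "yes": some child is a day-0 boy.
    yes = [fam for fam in product(kids, repeat=2) if ("B", 0) in fam]
    with_girl = [fam for fam in yes if any(sex == "G" for sex, _ in fam)]
    return Fraction(len(with_girl), len(yes))


def p_yes_given_sexes(sexes: str, n_days: int) -> Fraction:
    """P(parent answers yes | the children's sexes), e.g. sexes = "BB" or "BG"."""
    day_pairs = list(product(range(n_days), repeat=2))
    hits = sum(
        1
        for days in day_pairs
        if any(sex == "B" and day == 0 for sex, day in zip(sexes, days))
    )
    return Fraction(hits, len(day_pairs))


# n_days = 1 means the qualifier carries no information ("at least one boy").
for n in (1, 7, 27, 366):
    ratio = p_yes_given_sexes("BB", n) / p_yes_given_sexes("BG", n)
    print(n, p_girl_given_yes(n), "P(yes|BB)/P(yes|BG) =", ratio)
```

The last column is the point of the argument above: the BB parent's chance of saying yes, relative to the BG parent's, climbs from 1 toward 2 as the number of possible days grows.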
I spent a while writing this, so hopefully it helps!