# The two-daughter-problem [duplicate]

When I first heard about the two-daughter problem, I thought it was quite clear (after, of course, first falling into the trap like many of us), but on second glance I ran into some serious problems with my understanding.

The original problem seems to be quite easy: Assume that the only thing you know about a man with two kids is that at least one of the kids is a daughter. What is the probability that the other kid is a daughter as well? (Boys and girls are assumed to be born equally often.)

After the first impulse (“1/2 of course!”), it becomes clear that it is only 1/3. The problem can be mapped to a situation where, among all families with two children, only those with M/M are ruled out, while the equally likely cases F/F, F/M and M/F remain, so F/F makes up only one third of the remaining cases.
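The counting argument above can be checked by brute-force enumeration. This is a minimal Python sketch, assuming the four elder/younger combinations are equally likely; the variable names are illustrative and not part of the original problem.

```python
from fractions import Fraction
from itertools import product

# All (elder, younger) combinations, assumed equally likely.
families = list(product("FM", repeat=2))

# Knowing "at least one daughter" only rules out M/M.
with_daughter = [f for f in families if "F" in f]

# Share of the remaining families that have two daughters.
two_daughters = sum(f == ("F", "F") for f in with_daughter)
p_two_daughters = Fraction(two_daughters, len(with_daughter))

print(p_two_daughters)  # prints 1/3
```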

But now, meet Mr. Smith. I don’t know much about him (except that he has two children), but when he approached me, he told me: “I am so happy! Victoria just got the scholarship she wanted!”

Now what is the probability that Victoria has a sister?

Since I only know that Mr. Smith has two children, and one is obviously a girl, I am tempted to map this onto the two-daughter-problem, leading to the answer “1/3”.

But wait! What if I first ask Mr. Smith whether Victoria is his elder child? Assume his answer is yes (and ignore any problems with twins – even then one is typically a few seconds “older” than the other). So now I know that, of the cases (F/F, F/M, M/F), written as elder/younger, M/F also drops out. And now the probability of F/F has just risen to 1/2.

Okay, but what if his answer is no? Then Victoria is the younger one, and F/M drops out. Again, the probability rises to 1/2.

So I’m going to just ask him: “Well, Mr. Smith, is Victoria your elder child? Wait – don’t answer, because whatever you answer, it doesn’t matter. The probability just rose from 1/3 to 1/2.”

Or, even better, I do not even have to ask him, just thinking about the question will shift probabilities to 1/2, which means that the original probability for Victoria having a sister must already have been 1/2. But then the mapping to the two-daughter-problem is obviously false.
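In symbols, this step is the law of total probability. As a sketch, write $S$ for “Victoria has a sister” and $E$ for “Victoria is the elder child” (both symbols are introduced here for illustration), and take the two conditional values of 1/2 from the reasoning above at face value:

$$
P(S) = P(S \mid E)\,P(E) + P(S \mid \lnot E)\,P(\lnot E) = \tfrac{1}{2}\,P(E) + \tfrac{1}{2}\,\bigl(1 - P(E)\bigr) = \tfrac{1}{2}
$$

So if both answers really lead to 1/2, the probability before asking cannot have been 1/3, whatever $P(E)$ is.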

Where is my error?

Making things worse, I could also create a setup where Mr. Smith just tells me: “I have two kids, and at least one of them is a girl.” I then ask him: “Oh, can you give me a name of a daughter of yours?” and he answers: “Sure. Victoria.”

(Side note: I have a gut feeling that this has something to do with how to assume probability distributions behind situations, similar to the Two envelopes problem, but I can’t figure this out completely yet.)

——– UPDATE ——–

It seems that my error was to assume that the answer to the question “Is Victoria the older child?” changes the probabilities. If I know for sure that Mr. Smith was picked from a uniformly distributed (M/F, F/M, F/F) sample, then the knowledge that Victoria is the older child does not change anything, as was pointed out here, and the probability of her having a sister is 1/3.

But it is very interesting that solely from the sentence “Victoria just got the scholarship she wanted!” I can NOT infer that Mr. Smith is indeed chosen from this uniform distribution.

Imagine that all kids have the same chance to get a scholarship, and the happy father will tell us if it is the case. Then it is actually twice as probable that Mr. Smith will tell us about his daughter’s success if he has two girls, so the weighting of the four possibilities (M/M, F/M, M/F, F/F) is (0, 1, 1, 2). And in this case, the probability of Victoria having a sister is 1/2.
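The (0, 1, 1, 2) weighting can be reproduced with a small Bayes computation. This is a sketch under the assumption stated above: every child is equally likely to win a scholarship, so the chance of hearing about a daughter's scholarship is proportional to the number of daughters. The names `prior` and `weights` are illustrative.

```python
from fractions import Fraction
from itertools import product

# (elder, younger) combinations, each with prior probability 1/4.
families = list(product("FM", repeat=2))
prior = Fraction(1, 4)

# Assumption: the likelihood of the father announcing a daughter's
# scholarship is proportional to how many daughters he has.
weights = {f: prior * f.count("F") for f in families}

# Normalise to get the posterior probability of two daughters.
posterior_ff = weights[("F", "F")] / sum(weights.values())

print(posterior_ff)  # prints 1/2
```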

So another problem in my reasoning is the mapping of Mr. Smith’s statement to the two-daughter-problem. Without knowing more about the circumstances that led to Mr. Smith telling me about Victoria, I can’t say whether the probability is 1/3 or 1/2.

I think the confusion arises because the classical boy-girl problem is ambiguous:

‘You know that Mr. Smith has two kids, one of which is a girl. What is the chance she has a sister?’

The ambiguity here is that from this description, it is not clear how we came to know that ‘Mr. Smith has two kids, one of which is a girl.’

Consider the following two scenarios:

Scenario 1:

You have never met Mr. Smith before, but one day you run into him in the store. He has a little girl with him, who he tells you is one of his two children.

Scenario 2:

You are a TV producer, and you decide to do a show on ‘what is it like to raise a daughter?’ and you put out a call for such parents to come on the show. Mr. Smith agrees to come on the show, and as you get talking he tells you that he has two children.

Now notice: the original description applies to both cases. That is, in both cases it is true that you know that ‘Mr. Smith has two children, one of which is a daughter’.

However, in scenario 1, the chance of Mr. Smith having two daughters is $\frac{1}{2}$, but in scenario 2 it is $\frac{1}{3}$. The difference is that in the first scenario one specific child has been identified as female (and thus the chance of having two daughters amounts to her sibling being female, which is $\frac{1}{2}$), while in the second scenario no specific child is identified, so we can’t talk about ‘her sibling’ anymore, and instead have to consider a conditional probability which turns out to be $\frac{1}{3}$.
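The difference between the two scenarios can be seen in a quick Monte Carlo simulation. This is a rough sketch under two assumptions that the scenarios leave implicit: in scenario 1 the child who comes along to the store is picked uniformly at random regardless of sex, and in scenario 2 every father with at least one daughter is equally likely to answer the call. The function name `simulate` and its parameters are made up for this illustration.

```python
import random

def simulate(trials=100_000, seed=0):
    rng = random.Random(seed)
    store_hits = store_total = 0  # scenario 1: girl seen in the store
    show_hits = show_total = 0    # scenario 2: father answers the TV call
    for _ in range(trials):
        kids = [rng.choice("FM"), rng.choice("FM")]

        # Scenario 1: one specific child is observed; keep the case
        # only if that child happens to be a girl.
        seen = rng.randrange(2)
        if kids[seen] == "F":
            store_total += 1
            store_hits += kids[1 - seen] == "F"

        # Scenario 2: no specific child is identified; the father
        # qualifies as soon as he has at least one daughter.
        if "F" in kids:
            show_total += 1
            show_hits += kids == ["F", "F"]

    return store_hits / store_total, show_hits / show_total

p_store, p_show = simulate()
print(round(p_store, 2), round(p_show, 2))
```

With enough trials the first estimate should land near $\frac{1}{2}$ and the second near $\frac{1}{3}$. Changing how the observed child is selected in scenario 1 (for example, always showing a daughter if there is one) changes the result, which is exactly the ambiguity discussed here.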

Now, your original scenario, where you don’t know anything about Mr. Smith other than that he has two children, and then Mr. Smith says ‘I am so happy Victoria got a scholarship!’, is like scenario 1, not scenario 2. That is, unless Mr. Smith has two daughters called Victoria (which is possible, but extremely unlikely, and if he did one would have expected him to say something like ‘my older Victoria’), with his statement Mr. Smith has singled out one of his two children, making it equivalent to scenario 1.

Indeed, I would bet that most real-life cases where at some point it is true that ‘you know of some parent to have two children, one of which is a girl’ are logically isomorphic to scenario 1, not scenario 2. That is, the classic two-girl problem is fun and all, but most of the time the description of the problem is ambiguous from the start, and if you are careful to phrase it so that the answer is $\frac{1}{3}$, you will realize how uncommon it is for that kind of scenario to occur in real life. (Indeed, notice how I had to work pretty hard to come up with a real-life scenario that is at least somewhat plausible.)

Finally, all the variations of whether Victoria is the oldest, youngest, or whether you don’t even know her name (‘Mr. Smith tells you one of his children got a scholarship to the All Girls Academy’) do not change any of the probabilities (as you argued correctly): in most real-life scenarios, the way you come to know that ‘Mr. Smith has two children, one of which is a girl’ (and I would say that includes your original scenario) means that the chance of the other child being a girl is $\frac{1}{2}$, not $\frac{1}{3}$.

So, when at the end of your original post you ask “where is my error?” I would reply: your ‘error’ is that you assumed that the correct answer should be $\frac{1}{3}$, and that since your argument implied that it would be $\frac{1}{2}$, you concluded that there must have been an error in your reasoning. But, as it turns out, there wasn’t! For your scenario, the answer is indeed $\frac{1}{2}$, and not $\frac{1}{3}$. So your ‘error’ was to think that you had made an error!

Put a different way: you were temporarily blinded by the pure math (and I say ‘temporarily’, because you ended up asking all the right critical questions, and later realized that the classic two-girl problem is ambiguous: good job!). But what I mean is: we have seen this two-girl problem so often, and we have been told that the solution is $\frac{1}{3}$ so many times, that you immediately assume that this is also the correct answer in your described scenario… when in fact that is not the case, because the initial assumptions are different: the classic problem assumes a scenario 2 situation, but the original scenario described in your post is a scenario 1 situation.

It’s just like the Monty Hall problem … We have seen it so often, that as soon as it ‘smells’ like the Monty Hall problem, we say ‘switch!’ … when in fact there are all kinds of subtle variants in which switching is not any better, and sometimes even worse!

Also take a look at the Monkey Business Illusion: we have seen that video of the gorilla appearing in the middle of people passing a basketball so many times that we can now surprise people on the basis of that!