This can clearly be guessed from a search as well. Popularity can be well defined, and in the case of Obama, one candidate is clearly much more popular than the others.
The model still needs to infer from the sentence which entity to look up. This is also a relatively simple example: 'Obama' refers to a single class of entities, so there is little ambiguity about the class, only about the specific entity.
Take this sentence:
> When was KitKat released?
I could be referring to the sweet or the Android OS. These are vastly different classes, and the model here needs to "decide" to ask for more information to disambiguate the class. If the class is the sweet, it may then need to disambiguate the particular flavour, and possibly even ask for the geographic location.
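To make the two-level decision concrete, here is a rough sketch of how it might look. Everything in it is an assumption for illustration: the `Candidate` type, the `resolve` function, the `margin` threshold, and especially the prior scores, which are made-up numbers rather than real popularity data. The idea is only that the system asks a clarifying question when no class clearly dominates, and otherwise falls back to the most popular entity within the winning class.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    entity_class: str
    prior: float  # hypothetical popularity score, NOT real data


def class_scores(candidates):
    """Sum the priors per class and normalise them so they add up to 1."""
    totals = {}
    for c in candidates:
        totals[c.entity_class] = totals.get(c.entity_class, 0.0) + c.prior
    z = sum(totals.values())
    return {cls: score / z for cls, score in totals.items()}


def resolve(candidates, margin=0.3):
    """Return ("ask", question) if the class is ambiguous,
    otherwise ("answer", name of the most popular entity in the top class)."""
    ranked = sorted(class_scores(candidates).items(),
                    key=lambda kv: kv[1], reverse=True)
    # Ambiguous class: the two best classes are too close to call.
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
        options = " or the ".join(cls for cls, _ in ranked)
        return ("ask", f"Do you mean the {options}?")
    best_class = ranked[0][0]
    in_class = [c for c in candidates if c.entity_class == best_class]
    return ("answer", max(in_class, key=lambda c: c.prior).name)


# Illustrative inputs with invented priors.
kitkat = [
    Candidate("KitKat (chocolate bar)", "sweet", 0.55),
    Candidate("Android KitKat", "Android OS", 0.45),
]
obama = [
    Candidate("Barack Obama", "person", 0.90),
    Candidate("Michelle Obama", "person", 0.10),
]

print(resolve(kitkat))
print(resolve(obama))
```

With these invented numbers, the KitKat query triggers a clarifying question because the two classes are close, while the Obama query resolves directly since there is only one class and one dominant entity. A fuller version would repeat the same check inside the chosen class (flavour, region), which this sketch leaves out.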
Yes, but the amount of knowledge necessary to make those sorts of decisions is far smaller than the amount of knowledge necessary to answer all such questions.
And that's perfectly fine. Humans have exactly the same problem. They will get this wrong, and you will reply "no, I'm talking about the Android version". Language is ambiguous, so we cannot expect machines to get it right all the time.
I do agree with you that it is fine. What I was getting at is that there needs to be a way to measure uncertainty that is robust to unbalanced distributions and context drift.