|
>> When you summarize language in a similar way, you essentially produce a multidimensional map of the distances, based on common usage, between each word and every other word in the language. The problem with word embeddings, or with any distance-based model really, is that language doesn't work that way. Chomsky has a standard example he uses to make this point: "Instinctively, eagles that fly swim". He points out that in this sentence "instinctively" goes with "swim" (as in "instinctively, they swim"), even though the sentence, and the attachment, mean nothing (the sentence is nonsensical by design). If the relation were really based on distance, we would expect "instinctively" to attach to "fly", the verb closest to it in the word order. The fact that it doesn't suggests that something other than distance makes us pick the correct association out of all the possible interpretations of that sentence. Word vectors in their original form also have trouble with homonyms and similar "faux amis": for instance, does the word "cat" refer to the animal or to the Linux command? Because each word gets a single vector, the vector space makes no distinction between the two, so the animal ends up associated with the symbol ">" and the Linux command with "small" and "furry". |
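
To make the "cat" point concrete, here is a rough sketch, not how any particular embedding library works. It builds one context-count vector per word type from a tiny made-up corpus; the corpus, the `cooccurrence_vectors` helper, and the window size are all assumptions chosen for illustration. Since both senses of "cat" share one entry, the animal contexts and the shell contexts land in the same vector.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: "cat" appears both as an animal and as a shell command.
CORPUS = [
    "the small furry cat sleeps".split(),
    "a furry cat purrs".split(),
    "run cat file > out".split(),
    "cat log > report".split(),
]

def cooccurrence_vectors(sentences, window=2):
    """Build one context-count vector per word type (not per occurrence)."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    # Every occurrence of `word` updates the same vector,
                    # regardless of which sense was intended.
                    vectors[word][tokens[j]] += 1
    return vectors

vectors = cooccurrence_vectors(CORPUS)
cat = vectors["cat"]

# The single "cat" vector carries weight for both senses at once.
print("furry:", cat["furry"], "| >:", cat[">"])
```

Later contextual models sidestep this by computing a separate vector for each occurrence of a word, but the sketch above only covers the static, one-vector-per-word case described in the comment.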