by joosters
3692 days ago
I don't see how a linguistic parser can cope with all the ambiguities in human speech or writing. It's more than a problem of semantics: you also have to know things about the world we live in to work out which syntactic structure is correct. For example, take "The cat sat on the rug. It meowed." Did the cat meow, or did the rug? You can't determine that from semantics alone; you have to know that cats meow and rugs don't. So to parse language well, you need to know an awful lot about the real world. Simply training your parser on lots of text and throwing neural nets at the problem isn't going to fix that.
|
In terms of a basic probabilistic model, P(meow | rug) would be far lower than P(meow | cat), and that alone would be enough to push the parser toward the correct decision. Now, if the sentence were "The cat sat on the rug. It was furry", that would be more ambiguous, just as it would be for a human reader. But models trained on real-world data do learn about the world.
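
To make the P(meow | cat) vs. P(meow | rug) point concrete, here is a rough sketch of how it might look in code. Everything in it is an assumption for illustration: the (noun, predicate) pairs are an invented toy corpus, and the names `p_pred_given_noun` and `resolve_pronoun` are hypothetical, not taken from any real parser. It estimates the conditional probabilities from counts with add-alpha smoothing and picks the candidate antecedent that makes the predicate most likely.

```python
from collections import Counter

# Invented toy data: (noun, predicate) pairs standing in for what a model
# might observe in a large corpus. The counts are made up for illustration.
TOY_PAIRS = [
    ("cat", "meowed"), ("cat", "meowed"), ("cat", "sat"),
    ("cat", "slept"), ("cat", "was furry"),
    ("rug", "lay"), ("rug", "faded"), ("rug", "sat"), ("rug", "was furry"),
]

pair_counts = Counter(TOY_PAIRS)
noun_counts = Counter(noun for noun, _ in TOY_PAIRS)
PREDICATES = {pred for _, pred in TOY_PAIRS}


def p_pred_given_noun(pred, noun, alpha=1.0):
    """Add-alpha smoothed estimate of P(pred | noun) from the toy counts."""
    numerator = pair_counts[(noun, pred)] + alpha
    denominator = noun_counts[noun] + alpha * len(PREDICATES)
    return numerator / denominator


def resolve_pronoun(pred, candidates):
    """Pick the candidate antecedent under which the predicate is most likely."""
    return max(candidates, key=lambda noun: p_pred_given_noun(pred, noun))


if __name__ == "__main__":
    for pred in ("meowed", "was furry"):
        scores = {n: round(p_pred_given_noun(pred, n), 3) for n in ("cat", "rug")}
        print(pred, scores, "->", resolve_pronoun(pred, ["cat", "rug"]))
```

With these made-up counts, "meowed" clearly favours the cat, while "was furry" gives nearly equal scores for both nouns, which mirrors the ambiguity in the second example sentence. A real system would estimate these quantities from far more data and with a much richer model; this only shows the shape of the argument.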