|
|
|
|
|
by fromthestart
2476 days ago
|
|
I'm afraid you misunderstand the way embeddings work - at least for BERT based models, which are currently state of the art. BERT embeddings, after training change with context. In other words if you feed a paragraph about bank robbers and look at the encoding for bank, it will be meaningfully different from the encoding for the same word produced from a paragraph (or sentence) about river banks. We use BERT at the startup I work at, and one of our tests was the sentence "the bank robbers robbed the bank and then rested by the river bank". BERT was able to generate three different semantically meaningful encodings for the word bank in this sentence. The first two instances were much closer to each other in vector space (euclidean distance) than the last. This is huge, because it is arguably the first step in building an AI which can perform basic reasoning about information encoded in text. For example, if you average up the encodings of a paragraph of words, you can create an "encoding" which assigns a summary meaning or topic. Simple vector math becomes a powerful reasoning tool. The future is here. |
|
Well, except for the many many decades of previous work on NLP using symbolic methods that are quite capable. Although DNNs are en vogue and have some amazing properties, we shouldn't forget that symbolic AI/NLP using explicitly semantic representations is powerful and has a rich history, and complements DNNs quite well -- such as being easily explainable, for one.