I haven't done anything with fastText, but I have with word2vec. It embeds each word in a 300-dimensional vector, such that similar words have a large cosine similarity. (If you normalize each vector to unit norm, then cosine similarity is just a dot product.) So in short, it gives you a measure of how similar each word is to other words.
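To make the parenthetical concrete, here is a minimal sketch in plain Python. The 3-dimensional vectors are made up for illustration (real word2vec vectors would have 300 dimensions), and the function names are my own, not from any library:

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector so that it has unit (L2) norm."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Toy vectors, invented for illustration only.
king = [0.9, 0.1, 0.4]
queen = [0.8, 0.2, 0.5]

cos = cosine_similarity(king, queen)
# After normalizing both vectors, a plain dot product gives the same value.
dot_of_unit = dot(normalize(king), normalize(queen))
```

The practical upshot is that if you normalize all the vectors once up front, each similarity lookup is a single dot product.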
This has many uses in machine learning. You can extend it to documents to find similar documents, use it to find misspellings, use the vectors as features in an ML model, etc.
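One simple way to extend this to documents (my assumption here; there are other approaches) is to average the vectors of the words in each document and compare the averages with cosine similarity. A rough sketch with made-up 2-dimensional vectors, where `toy_vectors` and `document_vector` are hypothetical names for illustration:

```python
import math

# Toy 2-dimensional "embeddings", invented for illustration only.
toy_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "stock": [0.0, 1.0],
    "market": [0.1, 0.9],
}

def document_vector(tokens, vectors):
    """Average the vectors of the tokens that have an embedding."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token in the document has a vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

pets = document_vector(["cat", "dog"], toy_vectors)
pets_again = document_vector(["dog", "the", "cat"], toy_vectors)  # "the" is skipped
finance = document_vector(["stock", "market"], toy_vectors)

same_topic = cosine_similarity(pets, pets_again)
different_topic = cosine_similarity(pets, finance)
```

With these toy vectors, the two pet documents come out more similar to each other than either is to the finance one. The same averaged vector could also be fed to a model as a feature.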
There haven't been good vectors in that many languages (that I know of), so that's a plus for these fastText vectors.