Hacker News
by gattilorenz 1500 days ago
If I recall correctly, it's similar to how fastText vectors work. In fastText, a word's representation depends to some extent on its morphemes (not really, they are character n-grams, but bear with me), so rare or inflected words can get a better representation thanks to their similarity with more frequent words that look alike. E.g. "unconstitutional" might never appear in the training data, but the system can approximate its meaning by composing that of "un", which it has seen in words such as "unbelievable", with that of the remaining subtokens, which come from "constitutional", a word that was present in the training set.

Not sure if the same thing happens here, tho
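
To make the fastText part concrete, here is a rough sketch of the subword idea in Python. It is not fastText's actual code: the function names are made up, the n-gram vectors are hash-derived stand-ins instead of learned embeddings, and the word vector is a plain sum of n-gram vectors (a simplification). The 3 to 6 n-gram range and the "<" / ">" boundary markers follow fastText's defaults.

```python
import hashlib
import math


def char_ngrams(word, min_n=3, max_n=6):
    """Return the set of character n-grams of '<word>' (fastText-style subwords)."""
    padded = f"<{word}>"
    return {
        padded[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(padded) - n + 1)
    }


def toy_vector(ngram, dim=16):
    """Hypothetical stand-in for a learned n-gram embedding (hash-derived, NOT trained)."""
    digest = hashlib.sha256(ngram.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]


def word_vector(word, dim=16):
    """Compose a word vector by summing the vectors of its character n-grams."""
    vec = [0.0] * dim
    for ngram in char_ngrams(word):
        for k, x in enumerate(toy_vector(ngram, dim)):
            vec[k] += x
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    unseen = "unconstitutional"
    for known in ("constitutional", "unbelievable", "banana"):
        shared = char_ngrams(unseen) & char_ngrams(known)
        sim = cosine(word_vector(unseen), word_vector(known))
        print(f"{unseen} vs {known}: {len(shared)} shared n-grams, cosine {sim:.2f}")
```

Since "unconstitutional" shares most of its n-grams with "constitutional" and the "<un" prefix with "unbelievable", one would expect its composed vector to land closer to those than to an unrelated word, even though the word itself was never "seen".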