Hacker News
by minimaxir 530 days ago
The main reason Word2Vec's vector arithmetic worked is how it was trained: the network is shallow and the embeddings are trained directly, so the model's entire knowledge is contained in the embeddings themselves. This is not the case with any modern embedding model.
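
For anyone who hasn't seen the arithmetic in question, here is a rough sketch of the classic "king - man + woman" lookup. The 3-d vectors are hand-picked toy values, not real Word2Vec output, and `analogy` is a hypothetical helper name used only for illustration.

```python
import math

# Toy 3-d vectors, hand-picked for illustration only (NOT real Word2Vec output).
# Axes, loosely: royalty, maleness, femaleness.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.2, 0.2],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c, vocab):
    """Return the word nearest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vocab[a], vocab[b], vocab[c])]
    candidates = (w for w in vocab if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vocab[w]))

result = analogy("king", "man", "woman", vectors)
print(result)
```

With these toy values the nearest remaining word is "queen", but only because the vectors were chosen to make it so. Whether the same trick works on a given real model is exactly the point under discussion.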

With current models, the most you can do is average embeddings together.
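
A minimal sketch of what averaging could look like, assuming you already have a list of embedding vectors from whatever model you use. The two vectors below are made-up placeholders, `mean_vector` and `normalize` are hypothetical helper names, and re-normalizing the result is an assumption that only makes sense if you compare by cosine similarity.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def normalize(v):
    """Scale a vector to unit length (assumes it is not all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

# Placeholder embeddings; in practice these would come from an embedding model.
embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

# The averaged vector sits between the inputs; normalizing puts it back
# on the unit sphere so cosine comparisons behave as usual.
centroid = normalize(mean_vector(embeddings))
print(centroid)
```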

1 comment

Ah, well good to know. I need to read more. Thanks for taking a look.

When you refer to averaging embeddings together, do you mean averaging a bunch of sentences/words for "male" to get a general concept vector or do you mean averaging two different words, like "royal" and "adult male", to get to the combined concept, say "king"?