Hacker News new | ask | show | jobs
by v1n337 3263 days ago
You're right. It's a shallow single network with the weight embeddings of words at the intermediate layer extracted as the vector-space representation of a word, depending on its context.