by thisiszilff
789 days ago
Heh -- my explanation isn't the clearest, I realize, but yes, it is BoW (bag of words). E.g. fix your vocab of 50k words (or whatever) and enumerate it. Then, to make an embedding for some piece of text:
1. initialize an all-zero vector of size 50k
2. for each word in the text, add one to the vector entry at that word's index (per our enumeration). If the word isn't among the 50k words in your vocabulary, discard it
3. (optionally) normalize the embedding to unit length (though you don't really need this and can leave it off for the toy example).
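The three steps above can be sketched in a few lines of Python. This is only a toy illustration under some assumptions of mine: the helper names `build_vocab` and `bow_embedding` are made up, the tokenizer is a crude lowercase regex split, "normalize" is taken to mean L2 (unit length) normalization, and the vocabulary is 5 words rather than 50k.

```python
import math
import re


def build_vocab(words):
    """Enumerate the vocabulary: map each distinct word to a fixed index."""
    vocab = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def bow_embedding(text, vocab, normalize=False):
    """Bag-of-words embedding of `text` over a fixed, enumerated vocab."""
    # Step 1: all-zero vector with one slot per vocabulary word.
    vec = [0.0] * len(vocab)

    # Step 2: count each in-vocabulary word; out-of-vocabulary words are dropped.
    # (Assumed tokenizer: lowercase, then runs of letters, digits, apostrophes.)
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        idx = vocab.get(word)
        if idx is not None:
            vec[idx] += 1.0

    # Step 3 (optional): scale to unit L2 length; leave an all-zero vector alone.
    if normalize:
        norm = math.sqrt(sum(x * x for x in vec))
        if norm > 0:
            vec = [x / norm for x in vec]
    return vec


vocab = build_vocab(["the", "cat", "sat", "on", "mat"])
print(bow_embedding("The cat sat on the mat with a dog", vocab))
# [2.0, 1.0, 1.0, 1.0, 1.0]
```

Here "with", "a" and "dog" aren't in the vocab, so they are discarded, and "the" appears twice, so its slot ends up at 2. With a real 50k vocab you would likely want a sparse representation, since almost every entry is zero.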