by jeromebaek
2698 days ago
Interesting paper. I'd like to know how this compares with even more naive methods like simple summation. If this method is an application of Cover's theorem, it should handily beat summation or any other simple method that keeps the sentence embedding in the same dimension as the word embeddings.
The performance of the models in Hill et al. (2016), surprisingly poor by today's standards, can be at least partly explained by two factors: 1) they use poorer (older) word embeddings; and 2) FastSent sentence representations have the same dimensionality as the input word embeddings, yet they are compared in the same table to much higher-dimensional representations.
See also Figure 1 for the increase in performance across tasks as the embedding dimension is increased.
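
To make the dimensionality point concrete, here is a rough sketch of the two kinds of sentence representation being contrasted: plain summation, which stays in the word-embedding dimension, and a fixed random projection to a higher dimension followed by max pooling. The toy data, the function names, the uniform initialization range, and the choice of max pooling are all my own assumptions for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sentence: 5 words with 4-dimensional embeddings.
word_vectors = rng.standard_normal((5, 4))

def sum_pool(words):
    """Naive baseline: the sentence vector keeps the word-embedding dimension."""
    return words.sum(axis=0)

def random_projection_pool(words, out_dim, rng):
    """Project every word with one fixed random matrix, then max-pool over words."""
    in_dim = words.shape[1]
    # Assumed init: uniform in [-1/sqrt(in_dim), 1/sqrt(in_dim)].
    bound = 1.0 / np.sqrt(in_dim)
    projection = rng.uniform(-bound, bound, size=(in_dim, out_dim))
    return (words @ projection).max(axis=0)

summed = sum_pool(word_vectors)
projected = random_projection_pool(word_vectors, out_dim=64, rng=rng)
print(summed.shape, projected.shape)  # (4,) (64,)
```

A fair comparison would train the same classifier on both representations, and ideally also on a summed representation padded or projected to the same 64 dimensions, so that dimensionality is not confounded with the pooling method.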