by amorroxic 2975 days ago
Sentence similarity was what I explored with WMD (Word Mover's Distance) too. I ended up with a Keras setup in a siamese configuration with a Wasserstein + KL loss (I have a known vocabulary and feed in both the word vector sequences and their LDA distributions as input). After training, the cosine distances between encodings of such sequences look pretty decent. One issue I've spotted, though: WMD really seems to prefer roughly the same number of valid tokens in both sentences, which is not what real-world data looks like. Eager to see results of EM (earth mover's) distance between image feature vectors, cheers.
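
To make the token-count point concrete, here is a toy sketch in plain Python. It is not the Keras setup described above. It assumes uniform token weights (each token carries mass 1/n), Euclidean ground distance, and made-up 2-D "word vectors". The names `toy_wmd`, `short`, and `longer` are purely illustrative. It brute-forces the optimal transport plan, so it is only usable for a handful of tokens.

```python
import itertools
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def toy_wmd(a_vecs, b_vecs):
    """Brute-force WMD with uniform token weights (toy sizes only).

    Each token is replicated so that every copy carries the same mass
    1/size, where size = lcm(len(a_vecs), len(b_vecs)). The transport
    problem then reduces to finding the cheapest one-to-one assignment
    between the copies, found here by trying every permutation.
    """
    n, m = len(a_vecs), len(b_vecs)
    size = n * m // math.gcd(n, m)  # least common multiple of n and m
    a_rep = [v for v in a_vecs for _ in range(size // n)]
    b_rep = [v for v in b_vecs for _ in range(size // m)]
    cost = [[euclidean(u, v) for v in b_rep] for u in a_rep]
    best = min(
        sum(cost[i][perm[i]] for i in range(size))
        for perm in itertools.permutations(range(size))
    )
    return best / size


# Hypothetical 2-D embeddings, chosen only for illustration.
short = [(1.0, 0.0), (0.0, 1.0)]
longer = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # same two tokens plus one extra

print(toy_wmd(short, short))   # identical sentences
print(toy_wmd(short, longer))  # same content plus one extra token
```

Because the mass has to sum to 1 on both sides, the extra token in `longer` must receive mass from the tokens in `short`, so the distance is nonzero even though every token of `short` has an exact match. That is one plausible reading of why WMD favors sentence pairs with similar token counts.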