by nxa
396 days ago
Thank you! I actually had a hard time finding prior work on this, so I appreciate the references. The dictionary is based on WordNet (https://wordnet.princeton.edu/), no word2vec. It's just a plain lookup among precomputed embeddings (computed with mxbai-embed-large). And yes, I'm excluding words that are present in the query. It would be interesting to see how other models perform. I tried one (forgot the name) that was focused on coding, and it didn't perform nearly as well (in terms of human joy from the results).
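For anyone curious what "plain lookup among precomputed embeddings" can look like, here is a rough sketch. It is not the actual code: the names `cosine`, `lookup`, and `TOY_TABLE` are made up for illustration, the three-dimensional vectors are toy values standing in for real mxbai-embed-large embeddings, and ranking by cosine similarity is an assumption on my part.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def lookup(query, query_vec, table, k=3):
    """Return the k dictionary words closest to query_vec,
    skipping any word that already appears in the query text."""
    query_words = set(query.lower().split())
    scored = [
        (cosine(query_vec, vec), word)
        for word, vec in table.items()
        if word.lower() not in query_words  # exclude words present in the query
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]

# Toy stand-in for the precomputed word -> embedding table (hypothetical values).
TOY_TABLE = {
    "ocean": [0.9, 0.1, 0.0],
    "sea": [0.85, 0.15, 0.05],
    "lake": [0.7, 0.3, 0.1],
    "desert": [0.0, 0.2, 0.9],
}

# In a real setup query_vec would come from the embedding model; here it is hard-coded.
result = lookup("big ocean", [0.9, 0.1, 0.0], TOY_TABLE, k=2)
print(result)
```

With a dictionary the size of WordNet, a brute-force scan like this is still workable, though a vectorized or indexed search would likely be faster.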