by bayonetz 2415 days ago
Neat! eggie5 and friends, I did something related for Caviar that you might be interested in: https://developer.squareup.com/blog/caviars-word2vec-tagging...

Although your post is orthogonal to what we present in this work, I think it still merits discussion.

The high-level concept of a product is important for any commerce company w/ an unbounded and unstructured product catalog. In this case, the concept of a Dish is a very valuable primitive for Caviar and GH to understand.

You propose an interesting technique for dish tagging:

1. Train word2vec on menu item text
2. Generate menu item embeddings by aggregating respective word vectors
3. Curate dish (label) set
4. Generate embeddings for dish labels by aggregating word vectors
5. Tag menu items w/ their dish label by NN search.
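
To check my understanding, here is a rough sketch of steps 2 through 5 in plain Python. It is not the code from the Caviar post: the tiny hand-written `WORD_VECTORS` table is a made-up stand-in for vectors that a trained word2vec model would produce in step 1, averaging is just one possible way to aggregate, and `embed`, `cosine`, and `tag` are hypothetical helper names.

```python
import math

# ASSUMPTION: toy 3-d vectors standing in for trained word2vec output (step 1).
WORD_VECTORS = {
    "spicy": [0.1, 0.9, 0.2],
    "beef": [0.2, 0.8, 0.1],
    "taco": [0.0, 1.0, 0.0],
    "ramen": [1.0, 0.0, 0.1],
    "noodle": [0.9, 0.1, 0.0],
    "soup": [0.8, 0.0, 0.3],
    "pizza": [0.0, 0.1, 1.0],
    "margherita": [0.1, 0.0, 0.9],
}


def embed(text):
    """Steps 2 and 4: average the vectors of the words we know."""
    vecs = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vecs:
        return None  # no known words, so no embedding
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Step 3: a curated (here: invented) dish label set, embedded the same way.
DISH_LABELS = ["taco", "ramen", "pizza"]
LABEL_EMBEDDINGS = {label: embed(label) for label in DISH_LABELS}


def tag(menu_item):
    """Step 5: nearest-neighbor search over the dish label embeddings."""
    item_vec = embed(menu_item)
    if item_vec is None:
        return None
    return max(DISH_LABELS, key=lambda d: cosine(item_vec, LABEL_EMBEDDINGS[d]))


print(tag("Spicy beef taco"))  # taco
print(tag("noodle soup"))      # ramen
```

With a real catalog the brute-force `max` would presumably be swapped for an approximate NN index, but the shape of the pipeline should be the same.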

Then once you have this high-level concept of dish you can drive all kinds of interesting product innovations:

* trending Asian noodle dishes
* tacos in San Diego
* It also helps w/ the item sparsity problem in matrix factorization

Do I understand?

I'm going to try this on our next learning Friday.

We have taken a different route for dish tagging. First, a graphical model tags some items w/ low coverage but high precision, and then we feed those tags into a classifier that generalizes and provides full coverage. It's like a generative model fed into a discriminative model.
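
A toy sketch of that two-stage shape, for readers who want something concrete. To be clear about the assumptions: stage 1 here is a plain keyword rule, NOT the graphical model described above, and stage 2 is a small hand-rolled naive Bayes classifier chosen only for illustration. The menu items and all function names are invented.

```python
import math
from collections import Counter, defaultdict

# ASSUMPTION: keyword rules stand in for the high-precision stage 1 tagger.
RULES = {"taco": "taco", "ramen": "ramen"}


def rule_tag(item):
    """Stage 1: tag only when a rule keyword appears, otherwise abstain."""
    for word in item.lower().split():
        if word in RULES:
            return RULES[word]
    return None


def train_naive_bayes(labeled):
    """Stage 2 training: per-label word counts from (text, label) pairs."""
    word_counts, label_counts, vocab = defaultdict(Counter), Counter(), set()
    for text, label in labeled:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Stage 2 inference: highest log-probability label, add-one smoothing."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.lower().split():
            if w in vocab:  # skip words never seen in training
                score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


MENU = [
    "carnitas taco with salsa", "fish taco",
    "tonkotsu ramen with egg", "miso ramen",
    "carnitas plate with salsa", "tonkotsu bowl with egg",
]

# Stage 1 tags 4 of the 6 items; those tags become the training set.
seeds = [(item, rule_tag(item)) for item in MENU if rule_tag(item)]
model = train_naive_bayes(seeds)


def final_tag(item):
    """Keep the stage 1 tag when there is one, else fall back to stage 2."""
    return rule_tag(item) or predict(model, item)


for item in MENU:
    print(item, "->", final_tag(item))
```

The classifier picks up co-occurring words like "carnitas" and "tonkotsu" from the seed tags, which is how it can label items that the rules abstain on.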

>> items w/ low coverage but high precision

Newbie NLP dev here. What's this concept of coverage vs precision that you speak of? It sounds intriguing.

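It comes down to false positives vs. false negatives. Precision is the share of tagged items whose tag is correct: true positives / (true positives + false positives). Coverage is the share of all items that get a tag at all, so it behaves much like recall: true positives / (true positives + false negatives).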
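
A small worked example may help. This is a sketch under two assumptions of mine: coverage means the fraction of items that receive any tag, and precision means the fraction of tagged items whose tag matches a hand-checked gold label. The function name and the sample data are made up.

```python
def precision_and_coverage(predicted, gold):
    """predicted maps item -> label, or None when the tagger abstains.
    gold maps item -> true label."""
    tagged = {item: label for item, label in predicted.items() if label is not None}
    correct = sum(1 for item, label in tagged.items() if gold[item] == label)
    precision = correct / len(tagged) if tagged else 0.0
    coverage = len(tagged) / len(predicted) if predicted else 0.0
    return precision, coverage


gold = {"fish taco": "taco", "miso ramen": "ramen",
        "carnitas plate": "taco", "tonkotsu bowl": "ramen"}

# A cautious tagger: only tags the two items it is sure about.
predicted = {"fish taco": "taco", "miso ramen": "ramen",
             "carnitas plate": None, "tonkotsu bowl": None}

print(precision_and_coverage(predicted, gold))  # (1.0, 0.5)
```

Here every tag that was emitted is right (high precision), but half of the menu is left untagged (low coverage).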