Hacker News
by neilk 1140 days ago
I like it a lot, but it’s also frustrating. Perhaps it’s being hugged to death right now, but the lookups are very slow, so when I disagree with the results it’s a bit painful. If the results were quicker it would not be so bad; I could try different things.

I was a bit miffed that “currency” was not considered to be related to “mark”. Similarly I thought I’d found the perfect word between “ski” and “trust”, “mogul”, but once again your program disagreed.

Also, please help the player understand the basis of the word relations. I was surprised that the shortest path between “investor” and “mark” was a non-dictionary word: “zuckerberg”. Presumably you are not using WordNet but some corpus of embeddings. If you say where the corpus comes from I can tailor my guesses. Conversely though, the shortest path feature is good because it teaches me what works. Maybe a top 5 would be even better.

You’re onto something though, keep at it!

6 comments

I saw "mark" and immediately thought "scam" (which I figured would be easy to get to from "investor"), but it told me that "scam" and "mark" share only 2% similarity.

I can't even go from "investor" to "money" (16%). I'm not sure how "Zuckerberg" is closer to "investor" than "money" is.

I had the exact same first guess. Easy, I figured - that's a perfect connection between them. It just seems like the logic dictating how "close" two words are is opaque and incomplete.
I thought of the Deutsche Mark.
Exactly my feelings. The relatedness calculation needs to be an order of magnitude faster to make iterating on an idea fun.

After seeing the Zuckerberg path I tried Cuban, which is not related to Mark or investor despite Mark Cuban being far more famous as an investor than Mark Zuckerberg.

"Cuban" was my first try. I'm surprised that "Zuckerberg" works so well given how poorly "Cuban" performs.
Zuckerberg almost always refers to one specific person; Cuban is a last name of an investor and also describes things related to the nation of Cuba.
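
One way to picture this: a static word embedding stores a single vector per spelling, so a word with several senses tends to land somewhere between them. The sketch below uses made-up 3-dimensional vectors and an assumed 20/80 sense mix (the axes, numbers, and `blend` helper are all hypothetical, and nothing here comes from the game's actual model) to show how that blending could dilute similarity to "investor".

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def blend(senses):
    """Weighted sum of (weight, vector) pairs: one vector covering all senses."""
    dims = len(senses[0][1])
    return [sum(w * v[i] for w, v in senses) for i in range(dims)]

# Hypothetical toy axes: [finance, geography, technology]
investor = [1.0, 0.0, 0.2]
person_sense = [0.9, 0.0, 0.4]    # surname of a well-known investor
country_sense = [0.0, 1.0, 0.0]   # adjective relating to a country

# Assumption: "zuckerberg" has one dominant sense, while "cuban" is
# mostly used in the geographic sense (the 0.2 / 0.8 split is invented).
zuckerberg = blend([(1.0, person_sense)])
cuban = blend([(0.2, person_sense), (0.8, country_sense)])

print("investor ~ zuckerberg:", round(cosine(investor, zuckerberg), 2))
print("investor ~ cuban:     ", round(cosine(investor, cuban), 2))
```

With these invented numbers the single-sense surname stays close to "investor" while the blended vector drifts toward the geographic axis, which is consistent with the behaviour described above but does not prove it is the cause.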
I agree. This would be a great use case for the fastText.js library. It can calculate similarities of words based on embeddings in the browser - no need to wait for a slow PHP script.
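
For a sense of how little work a lookup involves once the vectors are in memory, here is a minimal sketch of ranking words by cosine similarity. It is written in Python for brevity rather than with any particular in-browser library's API, and the `EMBEDDINGS` table and `most_similar` helper are hypothetical toy stand-ins, not the game's data.

```python
import math

# Hypothetical toy embeddings; a real model would use hundreds of dimensions.
EMBEDDINGS = {
    "investor": [0.9, 0.1, 0.3],
    "money":    [0.8, 0.2, 0.1],
    "market":   [0.7, 0.3, 0.2],
    "ski":      [0.0, 0.9, 0.1],
    "mogul":    [0.4, 0.6, 0.2],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, k=5):
    """Return up to k (word, score) pairs closest to `word`, best first."""
    target = EMBEDDINGS[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Show the three closest words to "investor" as percentages.
for other, score in most_similar("investor", k=3):
    print(f"{other}: {score:.0%}")
```

Each comparison is one dot product and two norms, so scoring a guess against a handful of words is cheap; a "top 5" list is the same ranking truncated at `k=5`.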
I felt like "market" should have gotten a better relatedness score than 6% - investors participate in stock markets, and they often mark-to-market.
Agreed--I think it's overloaded at the moment. Not clear how it's supposed to work since the responses are so delayed. Looks fun, though!
I thought investor -> currency -> mark would be a slam dunk but apparently "mark" here is not related to the German Mark.