Hacker News
by James_Henry 1208 days ago
It's because it's not based on a simple measure of how similar the words' meanings are. It's based on how often the words are used next to each other, so since "celsius" is used with the word "temperature" more often in the corpus, it ends up closer to "temperature".

Try game #160 (25/02).

Input as guesses:

digit

number

car

compare

Once you've got the solution, please explain to me again how it works.

I don't know what game #160 is supposed to teach me, lol. I do admit that my explanation isn't right. The game's measure of similarity is based on the contexts in which words are used, not necessarily on their meaning. Some people would probably argue that context is what gives words their meaning. I think it's not just context.
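To make "similarity based on context" a bit more concrete, here's a toy sketch. It is not the game's actual model (nothing in this thread says what that is); the corpus, the window size and the function names are all made up for illustration. Each word gets a vector counting the words that appear near it, and two words are compared by the cosine of those vectors.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented purely for illustration (not the game's data).
corpus = [
    "the temperature is twenty degrees celsius",
    "the temperature is seventy degrees fahrenheit",
    "the car is parked outside",
    "the truck is parked outside",
]


def context_vectors(sentences, window=2):
    """For each word, count the words seen within `window` positions of it."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            neighbours = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(neighbours)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


vectors = context_vectors(corpus)
sim_units = cosine(vectors["celsius"], vectors["fahrenheit"])   # share "degrees"
sim_vehicles = cosine(vectors["car"], vectors["truck"])         # identical contexts
sim_mixed = cosine(vectors["celsius"], vectors["car"])          # no shared context

print(sim_units, sim_vehicles, sim_mixed)
```

Note that "car" and "truck" never appear in the same sentence here, yet they come out as the most similar pair, because they are surrounded by the same words. That's the difference between "used next to each other" and "used in similar contexts".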
What I meant was that the "nearness" is somehow "random". I don't doubt that the results come from a clever analysis of a zillion documents or websites, but the final result still makes little sense.

Your explanation is likely accurate in the case of celsius, but it doesn't seem to fit this other game/answer.

I don't want to spoil the answer to that game, but I find it hard to believe that both "car" and "compare" are nearer to the answer than "digit" or "number" if it is based on the number of occurrences in context.