Thanks for linking your game! I love how fast it is. One thing I would like is the ability to work backwards from the end--on the challenge problem yesterday I got all the way from "artist" to "chess", only to find that neither "chess" nor "checkmate" nor any other chess-related word I could think of met the 30% threshold to get to "check". That was frustrating.
Maybe have a button to flip the direction of the word chain, so you could work from either end and meet in the middle somewhere.
Hey! I'm trying to come up with a similar daily puzzle game, did anything help you come up with the idea? Also, do you generate these unrelated words daily and then vet them before releasing the daily puzzle, or is it all pretty much automated?
I don't really remember how I came up with the idea :) Basically I've become obsessed with word games lately and made a couple of them. These are the two I'm most proud of and happiest with:
As for Enlinko, it's a hard game to balance. That's why I've made three difficulty levels. As for daily puzzles, I'm now using a semi-automated generator: it generates a lot of pairs, tries to solve them, and checks whether the solution counts match some rules. Then I hand-pick the daily puzzles from those candidate pairs.
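For anyone curious, a generate / solve / filter loop of that kind could look roughly like the sketch below. It is not the actual Enlinko generator: the function names, the toy scores, the 20% threshold and the "1 to 3 one-word solutions" rule are all made up for illustration.

```python
import random

# Toy relatedness scores in percent, invented for illustration only.
TOY_SCORES = {
    frozenset(("sun", "beach")): 40,
    frozenset(("beach", "sand")): 55,
    frozenset(("sun", "sand")): 10,
}
VOCAB = ["sun", "beach", "sand", "tax"]


def toy_similarity(a, b):
    """Look up a toy score; pairs not listed count as 0% related."""
    return TOY_SCORES.get(frozenset((a, b)), 0)


def count_one_word_solutions(start, end, vocab, similarity, threshold):
    """Count single words that link start and end, both links >= threshold."""
    return sum(
        1
        for w in vocab
        if w not in (start, end)
        and similarity(start, w) >= threshold
        and similarity(w, end) >= threshold
    )


def candidate_pairs(vocab, similarity, threshold=20,
                    min_solutions=1, max_solutions=3, tries=200, seed=0):
    """Sample random pairs and keep those with an acceptable solution count."""
    rng = random.Random(seed)
    words = sorted(vocab)
    found = set()
    for _ in range(tries):
        start, end = rng.sample(words, 2)
        if similarity(start, end) >= threshold:
            continue  # already related, so not a puzzle
        n = count_one_word_solutions(start, end, words, similarity, threshold)
        if min_solutions <= n <= max_solutions:
            found.add(tuple(sorted((start, end))))
    return sorted(found)  # these would still be hand-picked afterwards
```

With the toy data, `candidate_pairs(VOCAB, toy_similarity)` keeps only the "sand" / "sun" pair, because "beach" is the single word that bridges them.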
I guess the data used for making those vectors doesn't contain many occurrences of those two words in relation.
Anyway, that's the downside of the word vectors idea. There will always be some words which we humans consider more or less related than the word vectors do.
I've tried finding the best one. It's different from what Semantle uses (word2vec from Google) and different from what Contexto uses (GloVe). But there are probably still many word pairs which could match better.
That's a really interesting idea. But what if it produces too many "false positives"? Maybe too many word pairs would be considered more related than one would expect.
"Relatedness" here is according to ... something ... Approximately but not exactly the likelyhood that words appear near to one another in a large corpus of text. Probably doing lookups on something crunched by google books.
I like it a lot, but it’s also frustrating. Perhaps it’s being hugged to death right now but the lookups are very slow, so when I disagree with the results it’s a bit painful. If the results were quicker it would not be so bad, I could try different things.
I was a bit miffed that “currency” was not considered to be related to “mark”. Similarly I thought I’d found the perfect word between “ski” and “trust”, “mogul”, but once again your program disagreed.
Also, please help the player understand the basis of the word relations. I was surprised that the shortest path between “investor” and “mark” was a non-dictionary word: “zuckerberg”. Presumably you are not using WordNet but some corpus of embeddings. If you say where the corpus comes from I can tailor my guesses. Conversely though, the shortest path feature is good because it teaches me what works. Maybe a top 5 would be even better.
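A shortest path like that can be found with a plain breadth-first search over words, treating two words as linked when their score clears the threshold. This is a sketch only: how the game actually computes it is unknown, `shortest_chain` and `score` are hypothetical names, the percentages are the ones commenters quote in this thread, and pairs nobody reported are assumed to score 0.

```python
from collections import deque

# Percent scores quoted by commenters in this thread.
SCORES = {
    frozenset(("investor", "zuckerberg")): 21,
    frozenset(("zuckerberg", "mark")): 20,
    frozenset(("investor", "cuban")): 3,
    frozenset(("cuban", "mark")): 4,
    frozenset(("investor", "money")): 16,
    frozenset(("money", "mark")): 2,
}
VOCAB = ["investor", "zuckerberg", "cuban", "money", "mark"]


def score(a, b):
    """Quoted score for a pair; unreported pairs are assumed to be 0."""
    return SCORES.get(frozenset((a, b)), 0)


def shortest_chain(start, end, vocab, threshold=20):
    """Breadth-first search for the shortest chain of sufficiently related words."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last == end:
            return path
        for word in vocab:
            if word not in seen and score(last, word) >= threshold:
                seen.add(word)
                queue.append(path + [word])
    return None  # no chain exists in this vocabulary


print(shortest_chain("investor", "mark", VOCAB))
# ['investor', 'zuckerberg', 'mark']
```

Collecting every chain of the shortest length, instead of returning the first one found, would give the kind of "top 5" list suggested above.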
I saw "mark" and immediately thought "scam" (which I figured would be easy to get to from "investor"), but it told me that "scam" and "mark" share only 2% similarity.
I can't even go from "investor" to "money" (16%). I'm not sure how "Zuckerberg" is closer to "investor" than "money" is.
I had the exact same first guess. Easy, I figured - that's a perfect connection between them. It just seems like the logic dictating how "close" two words are is opaque and incomplete.
Exactly my feelings. The relatedness calculation needs to be an order of magnitude faster to make iterating on an idea fun.
After seeing the Zuckerberg path I tried Cuban, which is not related to Mark or investor despite Mark Cuban being far more famous as an investor than Mark Zuckerberg.
I agree. This would be a great use case for the fastText.js library. It can calculate similarities of words based on embeddings in the browser, so there's no need to wait for a slow PHP script.
Interesting, I did that just now (without having seen your comment) and got 13% and 33% (for future reference, this comment is being made about an hour after the parent).
I tried the same guess, and felt the same confusion. Whatever quality it is that the relatedness factor measures doesn't seem to align well with my sense of word association.
I was delighted to see this--thanks for posting it. Games like this (and Semantle, etc.) have a surprisingly long history. The TikTok #gotitchallenge [0] shows one way to play in person, also demonstrated by the vlogbrothers [1]. But there was also a 19th C. parlor game called "What is My Thought Like?" [2] in which players had to make semantic connections between two random words or phrases, and that is basically the same game as "Le Jeu de la pensée" [3][my English translation 4], ca. 1701, which is an extended version with additional random features players have to connect to a random word
Comparing the first example against a similar guess based on intuition:
zuckerberg => investor(21%), mark(20%)
cuban => investor(3%), mark(4%)
Using google as a general guide to how often these words appear together
mark cuban => About 40,500,000 results on google
"mark cuban" => About 13,200,000 results on google
"mark" "cuban" => About 33,500,000 results on google
investor cuban => About 80,800,000 results on google
"investor cuban" => About 945 results on google
"investor" "cuban" => About 9,810,000 results on google
mark zuckerberg => About 41,700,000 results on google
"mark zuckerberg" => About 29,400,000 results on google
"mark" "zuckerberg" => About 35,700,000 results on google
investor zuckerberg => About 11,100,000 results on google
"investor zuckerberg" => About 479 results on google
"investor" "zuckerberg" => About 3,160,000 results on google
Considering the above results for how often the base words appear together, and the added knowledge that Mark Cuban is more recognized for his investment activity than Zuckerberg, I wonder how the relational scores are calculated by the game.
(Note: I realize this is nit-picking in an extreme sense, but I found myself very interested in the underlying tech behind the game and this was part of my exploration, so I thought I would share it with everyone else. Feel free to tear apart my methods; I am still very interested in how the OP coded their solution.)
I suspect this is because "cuban" has a lot of meaning in other contexts as well. If you see "cuban" out of context, you may think of Cuba or even sandwiches before thinking about Mark Cuban or other investors.
I'm irritated to learn that proper nouns are allowed. That's unusual for word games, and imho breaks the spirit of the thing. But honestly most of the frustration is not knowing whether the game is going to treat two words as related enough in advance. It doesn't feel like I'm being clever, it feels like I'm blindly exploring a graph.
How is relatedness measured? Using some embedding space? I often disagree with the measurements, the worst one being "punch" and "bowl" only relating 12%.
The concept is very fun though. I might try to make my own version, as it also seems like a fun side project and a way to explore different word embedding spaces. Could be fun to maybe also have a visualization of the embedding space.
Per instructions, word similarities are computed using word vectors[1].
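A common way to turn two word vectors into a relatedness score is cosine similarity. Whether this game uses exactly that is an assumption; the sketch below only shows the idea, with made-up 3-dimensional toy vectors standing in for real embeddings (which typically have hundreds of dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration; not taken from any real model.
TOY_VECTORS = {
    "radio": [0.9, 0.1, 0.3],
    "waves": [0.4, 0.8, 0.2],
    "ocean": [0.1, 0.9, 0.4],
}


def relatedness_percent(w1, w2, vectors=TOY_VECTORS):
    """Cosine similarity of two words' vectors, expressed as a rounded percentage."""
    return round(100 * cosine_similarity(vectors[w1], vectors[w2]))
```

The score depends only on the vectors, so two words that a human links through a pun or a second meaning can still come out far apart.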
Note that the relatedness of words will depend on the training set. Many of these word2vec-based games use vectors trained on Google News[2], so if "Unrelated Words" uses the same data, you should be looking for word pairs that are more common in news but perhaps less common in general text.
Semantle[3] is another game based on word vectors. I like "Unrelated Words" better because whereas Semantle requires guessing one fixed target word, which is often very different from its nearest neighbor, this game requires guessing a set of words, the flexibility of which makes it feel less frustrating.
I tried "money", because that seems pretty related to investor and also to mark (a monetary unit, not only German, but also an old English/Scottish equivalent of 13s 4d, if you can make sense of that), but the strength is only 16% and 2%, according to whatever embedding model they use.
I feel like the two are already equal in certain circles (eg the crypto space) so, as many comments are pointing out, understanding how relationships are built is important. That being said, if the black box is revealed, it’s not really a guessing game any more.
If I can make a UI suggestion, please consider making the word list fully visible and removing the overflow. It's fun to go the long way and see all my entries. I just completed the daily with 9 added words :)
In the instructions, I'd also make it a bit clearer what it takes to win the game. Like "Keep adding words until all words are similar by more than 20%"
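That win condition is easy to state as code. A minimal sketch, assuming the rule really is "every adjacent pair clears the threshold": the helper names are hypothetical, the scores are percentages quoted by commenters in this thread, and using `>=` rather than `>` is a guess, since links shown as exactly 20% appear in accepted solutions here. The threshold is a parameter because the thread mentions both 20% and 30%.

```python
# Percent scores quoted in this thread; pairs nobody reported default to 0.
QUOTED_SCORES = {
    frozenset(("radio", "sonar")): 23,
    frozenset(("sonar", "ocean")): 30,
    frozenset(("investor", "money")): 16,
    frozenset(("money", "mark")): 2,
}


def quoted_score(a, b):
    """Look up a quoted score regardless of word order."""
    return QUOTED_SCORES.get(frozenset((a, b)), 0)


def chain_is_solved(chain, score=quoted_score, threshold=20):
    """True if every adjacent pair in the chain scores at least `threshold` percent."""
    return all(score(a, b) >= threshold for a, b in zip(chain, chain[1:]))


print(chain_is_solved(["radio", "sonar", "ocean"]))    # True
print(chain_is_solved(["investor", "money", "mark"]))  # False
```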
Kudos for building the game, I hope it gains some traction.
It seems odd that waves could be less closely related to radio than air.
I wonder if there’s some homonym issue with wave or something like that?
Edit:
Similarly, I got a random puzzle:
Heat -> Bar
“Pressure” seems like it ought to be a good guess. But apparently Heat and pressure are 29% related (ok! Seems reasonable). But bar and pressure are only 7% related despite a bar being a unit of pressure.
The solution was to add kilobar between bar and pressure, which is fine I guess.
I bet it is a good implementation of word vectors. I just have no intuition for how word vectors work (does the presence of homonyms which are far from the pairing result in a less close pairing, for example?)
My word had a higher sum of percentages than the word listed as "the best solution for this puzzle so far". I chose (SPOILER): radio ->(23%) sonar ->(30%) ocean, vs radio ->(29%) air ->(20%) ocean, so 53% vs 49%. Is the first connection more important than the subsequent ones?
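To make that comparison concrete, here are the two paths scored three ways: by sum, by weakest link, and by first link. How the game actually ranks solutions is not stated anywhere in the thread, so these three rules are only guesses at what it might do; `summarize` is a hypothetical helper.

```python
# Link percentages for the two paths, as quoted in the comment above.
PATHS = {
    "radio -> sonar -> ocean": [23, 30],
    "radio -> air -> ocean": [29, 20],
}


def summarize(links):
    """Score a path under three candidate ranking rules."""
    return {"sum": sum(links), "weakest": min(links), "first": links[0]}


for name, links in PATHS.items():
    print(name, summarize(links))
# radio -> sonar -> ocean {'sum': 53, 'weakest': 23, 'first': 23}
# radio -> air -> ocean {'sum': 49, 'weakest': 20, 'first': 29}
```

Of the three, only "first link" puts the "air" path ahead. That fits the question asked above, but it is equally possible that nobody had submitted "sonar" before.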
I had to connect investor and mark... I put "grade". It said that it was not related to mark. I stopped at that point.
Cool concept, but it seems like it's not reliable/intuitive enough, unfortunately. Keep at it and I might try again on a future version with a better back-end/tolerance.
I got the same, I put "scammer". Also thought it was unrelated, even though a scammer convinces a mark they're an investor. Tried "cryptocurrency", still unrelated, though a cryptocurrency investor is a mark!
I'm trying this for the first time. Today's puzzle is asking me to link "radio" and "ocean". I put "waves", which is obviously the best answer :-) and it scored only a 19% match. It's now asking for /more/ linking words?!
Nice concept, but it doesn't work as a game yet, due to the vast space of creative associations between words which are possible but not detected by the app. Example:
banknote is not >20% related to either end word
investor (5%) mark (13%)
As others are mentioning, it would be nice to know what 'relatedness' means here, because a lot of words that seem like they'd be closely related are not, as calculated by the game
I'm the author of the https://enlinko.com/ game, which I published 24 days ago:
https://news.ycombinator.com/item?id=35630451
The domain for this game was created 9 days ago. So I think someone was heavily inspired by my idea.
I understand that anyone can make a game with the same idea, but I'm a bit sad that Enlinko hasn't got as much traction on HN as this game.