Unrelated Words Puzzle

Hello,

I'm the author of the https://enlinko.com/ game, which I published 24 days ago: https://news.ycombinator.com/item?id=35630451

The domain for this game was created 9 days ago, so I think someone was heavily inspired by my idea.

I understand that anyone can make a game with the same idea, but I'm a bit sad that Enlinko hasn't gotten as much traction on HN as this game.

CrazyStat 1138 days ago

Thanks for linking your game! I love how fast it is. One thing I would like is the ability to work backwards from the end--on the challenge problem yesterday I got all the way from "artist" to "chess", only to find that neither "chess" nor "checkmate" nor any other chess-related word I could think of met the 30% threshold to get to "check". That was frustrating.

Maybe have a button to flip the direction of the word chain, so you could work from either end and meet in the middle somewhere.

hershyb_ 1139 days ago

Hey! I'm trying to come up with a similar daily puzzle game, did anything help you come up with the idea? Also, do you generate these unrelated words daily and then vet them before releasing the daily puzzle, or is it all pretty much automated?

I don't really remember how I came up with that idea :) Basically, I've become obsessed with word games lately and made a couple of them. I'm most proud of and happy with these two: https://pixletters.com https://betweenle.com

As for Enlinko, it's a hard game to balance. That's why I made three difficulty levels. As for daily puzzles, I'm now using a semi-automated generator: it generates a lot of pairs, tries to solve them, and checks whether the solution counts match some rules. Then I hand-pick the daily puzzles from those candidate pairs.
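A generate, solve, filter pipeline like the one described above might look roughly like the following. This is a minimal sketch, not the author's code: the `SIM` table, `THRESHOLD`, `shortest_chain` and `candidate_pairs` are all hypothetical names, the scores are partly borrowed from numbers quoted in this thread and partly made up, and the "rules" are assumed to be a simple bound on how many bridge words the shortest solution needs.

```python
from collections import deque
from itertools import combinations

# Toy relatedness scores in percent (symmetric). A real generator would
# derive these from word vectors; "amplitude" scores are invented.
SIM = {
    ("ocean", "waves"): 55, ("waves", "amplitude"): 40,
    ("amplitude", "radio"): 35, ("ocean", "air"): 20,
    ("air", "radio"): 29, ("ocean", "radio"): 5,
    ("waves", "radio"): 19,
}
WORDS = sorted({w for pair in SIM for w in pair})
THRESHOLD = 20  # assumed minimum score for two words to count as linked


def related(a, b):
    """True if the two words meet the threshold in either key order."""
    return max(SIM.get((a, b), 0), SIM.get((b, a), 0)) >= THRESHOLD


def shortest_chain(start, goal):
    """Breadth-first search for the shortest chain of linked words."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for word in WORDS:
            if word not in seen and related(path[-1], word):
                seen.add(word)
                queue.append(path + [word])
    return None  # no chain exists in this vocabulary


def candidate_pairs(min_bridges=1, max_bridges=2):
    """Keep unrelated pairs whose shortest solution needs a few bridge words."""
    picks = []
    for a, b in combinations(WORDS, 2):
        if related(a, b):
            continue  # already linked, so not a puzzle
        chain = shortest_chain(a, b)
        if chain and min_bridges <= len(chain) - 2 <= max_bridges:
            picks.append((a, b, chain))
    return picks
```

The output of `candidate_pairs()` would then be the list a human picks from by hand.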

zeven7 1139 days ago

Great idea for a game!

A question I ran into while playing your game is why it says "Amazon" and "Prime" are only 3% related? That seems very surprising.

I'm using these vectors, whose latest version is from 2019: https://github.com/commonsense/conceptnet-numberbatch

I guess the data used to build those vectors doesn't contain many occurrences of those two words in relation to each other.

Anyway, that's the downside of the word-vector idea. There will always be some words that we humans consider more or less related than the word vectors do.

I tried to find the best one. It's different from what Semantle uses (word2vec from Google) and from what Contexto uses (GloVe). But there are probably still many word pairs that could match better.

slig 1139 days ago

What about using those three models and returning the best score between them?

That's a really interesting idea. But what if it produces too many "false positives"? Maybe too many word pairs would be considered more related than one would expect.
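To make the trade-off concrete, here is a small sketch of the "best score across models" idea. Everything in it is an assumption for illustration: the model names are just labels, the scores are invented rather than taken from any real model, and `best_score` and `passing` are hypothetical helpers.

```python
THRESHOLD = 20  # assumed cut-off, in percent

# Invented per-model relatedness scores in percent.
SCORES = {
    "numberbatch": {("amazon", "prime"): 3, ("punch", "bowl"): 12, ("cat", "tax"): 2},
    "word2vec": {("amazon", "prime"): 24, ("punch", "bowl"): 15, ("cat", "tax"): 4},
    "glove": {("amazon", "prime"): 18, ("punch", "bowl"): 31, ("cat", "tax"): 22},
}


def best_score(pair):
    """Highest score any of the models gives to this pair."""
    return max(model[pair] for model in SCORES.values())


def passing(scores):
    """Pairs that meet the threshold under one score table."""
    return {pair for pair, value in scores.items() if value >= THRESHOLD}


per_model = {name: passing(scores) for name, scores in SCORES.items()}
combined = {pair for pair in SCORES["numberbatch"] if best_score(pair) >= THRESHOLD}

for name, pairs in per_model.items():
    print(name, sorted(pairs))
print("combined", sorted(combined))
```

Because the combined set is the union of what each model accepts, it can only grow: every plausible pair one model misses is rescued, but so is every odd pair that any single model happens to rate highly. Scores from different models are also not necessarily on the same scale, so some calibration would probably be needed before taking a maximum.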

hgsgm 1139 days ago

Enlinko is a dead end because it only allows 1 5-second game per day.

Also I don't see source code.

NiloCK 1139 days ago

See also https://enlinko.com/. The calculations here are snappy.

I do not know which one was created first.

"Relatedness" here is according to ... something ... Approximately, but not exactly, the likelihood that words appear near one another in a large corpus of text. Probably doing lookups on something crunched by Google Books.

Hello,

Thanks for posting about Enlinko. I'm the author of it; I published it 24 days ago: https://news.ycombinator.com/item?id=35630451

The domain for this game was created 9 days ago, so I think someone was heavily inspired by my idea.

I understand that anyone can make a game with the same idea, but I'm a bit sad that Enlinko hasn't gotten as much traction on HN as this game.

As for relatedness, my game uses semantic vectors from this model: https://github.com/commonsense/conceptnet-numberbatch
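The thread does not say how either game turns vectors into a percentage. A common choice is cosine similarity, so the sketch below assumes that, with negative values clamped to 0. The three-dimensional vectors are made up (real embeddings have hundreds of dimensions), and `relatedness_percent` is a hypothetical helper, not part of any library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relatedness_percent(u, v):
    """Clamp negative similarities to 0 and scale to a 0-100 percentage."""
    return round(max(0.0, cosine_similarity(u, v)) * 100)


# Made-up toy vectors, for illustration only.
VECTORS = {
    "ocean": [0.9, 0.1, 0.0],
    "sea": [0.8, 0.2, 0.1],
    "radio": [0.0, 0.2, 0.9],
}

print(relatedness_percent(VECTORS["ocean"], VECTORS["sea"]))
print(relatedness_percent(VECTORS["ocean"], VECTORS["radio"]))
```

Under this assumption, two words score highly only when their vectors point in nearly the same direction, which depends entirely on the data the vectors were built from.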

this is really nice!

I like it a lot, but it’s also frustrating. Perhaps it’s being hugged to death right now but the lookups are very slow, so when I disagree with the results it’s a bit painful. If the results were quicker it would not be so bad, I could try different things.

I was a bit miffed that “currency” was not considered to be related to “mark”. Similarly I thought I’d found the perfect word between “ski” and “trust”, “mogul”, but once again your program disagreed.

Also, please help the player understand the basis of the word relations. I was surprised that the shortest path between “investor” and “mark” was a non-dictionary word: “zuckerberg”. Presumably you are not using WordNet but some corpus of embeddings. If you say where the corpus comes from I can tailor my guesses. Conversely though, the shortest path feature is good because it teaches me what works. Maybe a top 5 would be even better.

You’re onto something though, keep at it!

SamBam 1139 days ago

I saw "mark" and immediately thought "scam" (which I figured would be easy to get to from "investor"), but it told me that "scam" and "mark" share only 2% similarity.

I can't even go from "investor" to "money" (16%). I'm not sure how "Zuckerberg" is closer to "investor" than "money" is.

kimbernator 1139 days ago

I had the exact same first guess. Easy, I figured - that's a perfect connection between them. It just seems like the logic dictating how "close" two words are is opaque and incomplete.

johtso 1139 days ago

I thought Deutsche Mark..

CrazyStat 1139 days ago

Exactly my feelings. The relatedness calculation needs to be an order of magnitude faster to make iterating on an idea fun.

After seeing the Zuckerberg path I tried Cuban, which is not related to Mark or investor despite Mark Cuban being far more famous as an investor than Mark Zuckerberg.

nilstycho 1139 days ago

"Cuban" was my first try. I'm surprised that "Zuckerberg" works so well given how poorly "Cuban" performs.

pflats 1139 days ago

Zuckerberg almost always refers to one specific person; Cuban is a last name of an investor and also describes things related to the nation of Cuba.

tomthe 1139 days ago

I agree. This would be a great use case for the fastText.js library. It can calculate similarities of words based on embeddings in the browser - no need to wait for a slow PHP script.

sparsely 1140 days ago

I felt like "market" should have got better than 6% related score - investors participate in stock markets, and they often mark-to-market.

obituary_latte 1140 days ago

Agreed--I think it's overloaded at the moment. Not clear how it's supposed to work since the responses are so delayed. Looks fun, though!

albrewer 1139 days ago

I thought investor -> currency -> mark would be a slam dunk but apparently "mark" here is not related to the German Mark.

xyztimm 1139 days ago

I put waves for radio and ocean and got 19% and 55%.

Is my expectation that the first percentage should be higher off?

A_D_E_P_T 1139 days ago

"Investment" --> "Capital"

Check. That worked.

"Capital" --> "Letter"

Did not work at all. And yet the two words are side-by-side with extreme frequency.

So, basically, I don't know how this game gauges relatedness. I do know that I don't like it.

dangond 1139 days ago

Semantic similarity means similarity of meaning, not how frequently they appear together.

Try out semantle to get a better sense of it if it's not immediately intuitive what I mean by that.

Avshalom 1139 days ago

Interesting, I did that just now (without having seen your comment) and got 13% and 33% (for future reference, this comment is being made about an hour after the parent).

Edit: oh wait, never mind: wave/waves

zeven7 1139 days ago

Did you really? I just did it just now (9 minutes after your comment) and got 19% and 55%.

zoogeny 1139 days ago

I literally just did the exact same. I'm totally unmotivated by this game if this kind of connection isn't what it is looking for.

I also did waves. I needed amplitude to bridge the gap to radio.

I also had a random one:

Heat -> bar

I went with pressure, since it is related to heat obviously, and a bar is a unit of pressure. But it didn’t like the second one.

I wonder if there’s a homonym issue, or if I just don’t understand word embeddings.

marssaxman 1139 days ago

I tried the same guess, and felt the same confusion. Whatever quality it is that the relatedness factor measures doesn't seem to align well with my sense of word association.

WobbuPalooza 1139 days ago

I was delighted to see this--thanks for posting it. Games like this (and Semantle, etc.) have a surprisingly long history. The TikTok #gotitchallenge [0] shows one way to play in person, also demonstrated by the vlogbrothers [1]. But there was also a 19th C. parlor game called "What is My Thought Like?" [2] in which players had to make semantic connections between two random words or phrases, and that is basically the same game as "Le Jeu de la pensée" [3][my English translation 4], ca. 1701, which is an extended version with additional random features players have to connect to a random word.

[0] https://www.tiktok.com/tag/gotitchallenge

[1] https://www.youtube.com/watch?v=kyx8iMKYrE8

[2] https://www.google.com/books/edition/American_Girl_s_Book/WO...

[3] https://www.google.com/books/edition/Les_jeux_d_esprit_ou_La...

[4] https://wobbupalooza.neocities.org/1701#tr_60

duckqlz 1139 days ago

Comparing the first example against a similar guess based on intuition:

zuckerberg => investor(21%), mark(20%)

cuban => investor(3%), mark(4%)

Using google as a general guide to how often these words appear together

mark cuban => About 40,500,000 results on google

"mark cuban" => About 13,200,000 results on google

"mark" "cuban" => About 33,500,000 results on google

investor cuban => About 80,800,000 results on google

"investor cuban" => About 945 results on google

"investor" "cuban" => About 9,810,000 results on google

mark zuckerberg => About 41,700,000 results on google

"mark zuckerberg" => About 29,400,000 results on google

"mark" "zuckerberg" => About 35,700,000 results on google

investor zuckerberg => About 11,100,000 results on google

"investor zuckerberg" => About 479 results on google

"investor" "zuckerberg" => About 3,160,000 results on google

Considering the above results of how often the base words appear together and the added knowledge that Mark Cuban is more recognized for his investment activity than Zuckerberg I wonder how the relational scores are calculated by the game.

(Note: I realize this is nit-picking in an extreme sense, but I found myself very interested in the underlying tech behind the game and this was part of my exploration, so I thought I would share it with everyone else. Feel free to tear apart my methods; I am still very interested in how the OP coded their solution.)

wakamoleguy 1139 days ago

I suspect this is because "cuban" has a lot of meaning in other contexts as well. If you see "cuban" out of context, you may think of Cuba or even sandwiches before thinking about Mark Cuban or other investors.

zeta0134 1139 days ago

I'm irritated to learn that proper nouns are allowed. That's unusual for word games, and imho breaks the spirit of the thing. But honestly most of the frustration is not knowing whether the game is going to treat two words as related enough in advance. It doesn't feel like I'm being clever, it feels like I'm blindly exploring a graph.

kaesve 1140 days ago

How is relatedness measured? Using some embedding space? I often disagree with the measurements, the worst one being "punch" and "bowl" only relating 12%.

The concept is very fun though. I might try to make my own version, as it also seems like a fun side project and a way to explore different word embedding spaces. Could be fun to maybe also have a visualization of the embedding space.

omoikane 1139 days ago

Per instructions, word similarities are computed using word vectors[1].

Note that the relatedness of words will depend on the training set. Many of these word2vec-based games use data that was trained on Google News[2], so if "Unrelated Words" uses the same data, you should be looking for word pairs that are more common in news but perhaps less common in general text.

Semantle[3] is another game based on word vectors. I like "Unrelated Words" better because whereas Semantle requires guessing one fixed target word, which is often very different from its nearest neighbor, this game requires guessing a set of words, the flexibility of which makes it feel less frustrating.

[1] https://en.wikipedia.org/wiki/Word_embedding

[2] https://code.google.com/archive/p/word2vec/

[3] https://news.ycombinator.com/item?id=31588388

kimbernator 1139 days ago

Apparently "invest" is only 6% related to "investor"?

I think the logic here needs some work, very cool idea though.

webstrand 1140 days ago

I really don't get this. Swindler has 0% relatedness to mark?

tgv 1139 days ago

I tried "money", because that seems pretty related to investor and also to mark (a monetary unit, not only German, but also an old English/Scottish equivalent of 13s 4p, if you can make sense of that), but the strength is only 16% and 2%, according to whatever embedding model they use.

Lewton 1140 days ago

similarly "scam" is only 2% related to mark

I'd expect scam to be squarely in the middle between investor and mark :D

NeoTar 1140 days ago

I expected Germany to link to 'mark' (i.e. the former currency), but apparently not.

grammarxcore 1139 days ago

I feel like the two are already equal in certain circles (eg the crypto space) so, as many comments are pointing out, understanding how relationships are built is important. That being said, if the black box is revealed, it’s not really a guessing game any more.

taneq 1140 days ago

"scam" was my first try, too. I would have thought it would be strongly related to both "investor" and "mark"?

0xdada 1140 days ago

also, bank and note are only 15% related

tiborsaas 1139 days ago

If I can make a UI suggestion, please consider making the word list fully visible and removing the overflow. It's fun to go the long way and see all my entries. I just completed the daily with 9 added words :)

In the instructions, I'd also make it a bit clearer what it takes to win the game. Like "Keep adding words until all words are similar by more than 20%"

Kudos for building the game, I hope it gains some traction.
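The win condition suggested above can be written down in a few lines. This is only a sketch of one reading of it: it assumes that only neighbouring words in the chain need to be related, and it uses "at least 20%" rather than "more than 20%" because a 20% link is quoted as part of an accepted solution elsewhere in this thread. The scores are the ones commenters reported for radio/ocean; `score`, `weakest_link` and `is_solved` are hypothetical names.

```python
# Scores in percent as reported by commenters in this thread.
SCORES = {
    frozenset(("radio", "waves")): 19,
    frozenset(("waves", "ocean")): 55,
    frozenset(("radio", "air")): 29,
    frozenset(("air", "ocean")): 20,
}


def score(a, b):
    """Look up a pair regardless of order; unknown pairs count as 0."""
    return SCORES.get(frozenset((a, b)), 0)


def weakest_link(chain):
    """Lowest score between neighbouring words in the chain."""
    return min(score(a, b) for a, b in zip(chain, chain[1:]))


def is_solved(chain, threshold=20):
    """Assumed rule: every neighbouring pair must reach the threshold."""
    return len(chain) >= 2 and weakest_link(chain) >= threshold


print(is_solved(["radio", "waves", "ocean"]))  # one link is only 19%
print(is_solved(["radio", "air", "ocean"]))
```

Seen this way, a chain is only as good as its weakest link, which is why one 19% pair is enough to keep the puzzle open.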

I got ocean->radio

My path:

Ocean -> waves -> amplitude -> radio

Solution path:

Ocean -> air -> radio

It seems odd that waves could be less closely related to radio than air.

I wonder if there’s some homonym issue with wave or something like that?

Edit:

Similarly, I got a random puzzle:

Heat -> Bar

“Pressure” seems like it ought to be a good guess. But apparently heat and pressure are 29% related (ok! Seems reasonable). But bar and pressure are only 7% related despite a bar being a unit of pressure.

The solution was to add kilobar between bar and pressure, which is fine I guess.

harshalizee 1139 days ago

I got the same ocean->radio. How is 'waves' not a qualifying word?

I don’t know… in particular, I don’t really have a good intuition for word embedding closeness at all.

tartrate 1139 days ago

Same here. I just concluded that this is probably not that well implemented.

I bet it is a good implementation of word vectors. I just have no intuition for how word vectors work (does the presence of homonyms which are far from the pairing result in a less close pairing, for example?)
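One way to build intuition for the homonym question is a toy model. Static embeddings store one vector per spelling, so a word with several senses ends up somewhere between them. The sketch below assumes, as a deliberate simplification, that this vector is a usage-weighted average of made-up, orthogonal "sense" vectors. The weights are invented and `blend` is a hypothetical helper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def blend(senses, weights):
    """Weighted average of sense vectors, standing in for one static word vector."""
    dims = len(senses[0])
    return [sum(w * s[i] for s, w in zip(senses, weights)) for i in range(dims)]


# Made-up sense directions: a pub, a unit of pressure, the legal profession.
PUB, UNIT, LEGAL = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

pressure = UNIT                                            # one dominant sense
bar_unit_only = blend([UNIT], [1.0])                       # if "bar" meant only the unit
bar_mixed = blend([PUB, UNIT, LEGAL], [0.8, 0.05, 0.15])   # assumed usage mix

print(cosine(bar_unit_only, pressure))
print(cosine(bar_mixed, pressure))
```

In this toy setup the mixed vector lands far from "pressure" even though one of its senses matches exactly, so under these assumptions the answer to the question above would be yes: unrelated senses dilute the pairing.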

thih9 1139 days ago

My word had a higher sum of percentages than the word listed as "the best solution for this puzzle so far". I chose (SPOILER): radio ->(23%) sonar ->(30%) ocean, vs radio ->(29%) air ->(20%) ocean, so 53% vs 49%. Is the first connection more important than the subsequent ones?
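The two paths can be compared under a few candidate ranking rules. The thread does not say how the game picks its "best solution", so the rules below are guesses for illustration only, and the ranking may not depend on the scores at all.

```python
# Link scores in percent, as quoted in the comment above.
paths = {
    "sonar": [23, 30],  # radio -> sonar -> ocean
    "air": [29, 20],    # radio -> air -> ocean
}

# Hypothetical ranking rules; none is confirmed to be what the game uses.
metrics = {
    "sum": sum,
    "weakest link": min,
    "first link": lambda links: links[0],
}

winners = {
    name: max(paths, key=lambda p, fn=fn: fn(paths[p]))
    for name, fn in metrics.items()
}

for name, winner in winners.items():
    print(f"{name}: {winner}")
```

Of these three rules, only "first link" prefers the listed "air" path (29% vs 23%). Both the sum and the weakest link favor "sonar".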

jfk13 1139 days ago

I was disappointed that "waves" didn't (quite) work.

thih9 1138 days ago

Same!

tiledjinn 1139 days ago

i got:

investor mark

put in 'check' which is apparently related 0% to investor. clearly the author has bad experiences in seeking funds

adverbly 1139 days ago

I had to connect investor and mark... I put "grade". It said that it was not related to mark. I stopped at that point.

Cool concept, but seems like it's not reliable/intuitive enough unfortunately. Keep at it and I might try again on a future version with a better back-end/tolerance.

SAI_Peregrinus 1139 days ago

I got the same, I put "scammer". Also thought it was unrelated, even though a scammer convinces a mark they're an investor. Tried "cryptocurrency", still unrelated, though a cryptocurrency investor is a mark!

31337Logic 1139 days ago

I'm trying this for the first time. Today's puzzle is asking me to link "radio" and "ocean". I put "waves", which is obviously the best answer :-) and it scored only a 19% match. It's now asking for /more/ linking words?!

Uhmmmm... no.

k_ 1139 days ago

Same here, was puzzled too and had to add "wavelength" to complete today's words.

ArekDymalski 1139 days ago

Nice concept, but it doesn't work as a game yet, due to the vast space of creative associations between words that are possible but not detected by the app. Example:

banknote is not >20% related to either end word: investor (5%), mark (13%)

okl 1140 days ago

Very fun puzzle for practicing your vocabulary. Hope it gets more upvotes.

megmogandog 1139 days ago

As others are mentioning, it would be nice to know what 'relatedness' means here, because a lot of words that seem like they'd be closely related are not, as calculated by the game

glitcher 1139 days ago

Very fun, just a bit slow right now. My solution was far from optimal, but I eventually connected them with:

investor > banker > robber > accomplice > patsy > mark

ouid 1139 days ago

the answer to radio-ocean is obviously wave. But wave was only 14% related to radio, which is wrong enough for me to call it a bug

SilasX 1139 days ago

Wow. I like this much better than Semantle, which seemed to be really bizarre about what was related.

e-gn 1140 days ago

I’m already addicted to it.

woliveirajr 1140 days ago

What makes a word close to another, so that it gets above the 20% mark?