by epiccoleman 890 days ago
This is pretty interesting. Intuitively, Connections is the kind of thing I would expect GPT to not be good at, because almost every day there's something that feels kind of "out of left field" in the categories. In my experience LLMs are good at regurgitating the "standard" take on a topic, or "best practices", but lack the creativity and out-of-the-box thinking that makes Connections fun.

On the other hand, it feels like the kind of thing where an LLM might be surprisingly good, because it might, in theory, see more correlations than a human can. Based on these results, I guess my first intuition holds up.

I wonder if a better / different way to approach this could be more "algorithmic": maybe have the LLM generate a list of possible categories for each individual word and then try to operate on those associations?
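
A rough sketch of what "operating on those associations" could look like, in Python. Everything here is an assumption for illustration: `candidate_groups` is a made-up helper, and the `word_categories` dict is hand-written to stand in for whatever categories an LLM would actually return for each word. The idea is just to invert word -> categories into category -> words and keep the categories that cover at least four words.

```python
from collections import defaultdict


def candidate_groups(word_categories, size=4):
    """Invert word -> categories and keep categories shared by >= size words."""
    by_category = defaultdict(set)
    for word, categories in word_categories.items():
        for category in categories:
            by_category[category].add(word)
    # A category is only a candidate answer if enough words fall under it.
    return {
        category: sorted(words)
        for category, words in by_category.items()
        if len(words) >= size
    }


# Hypothetical per-word category lists (hand-written, not real LLM output).
word_categories = {
    "COAT": ["clothing", "pea ___"],
    "GREEN": ["colors", "pea ___"],
    "POD": ["groups of animals", "pea ___"],
    "SOUP": ["food", "pea ___"],
    "SCARF": ["clothing"],
}

print(candidate_groups(word_categories))
```

In a real puzzle, several categories would probably clear the threshold and overlap, so there would still need to be a step that picks four non-overlapping groups.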

Cool article!


The "whole point" of embeddings is that each word gets a vector that represents how well it fits into certain categories, so words that belong together are close in that vector space. So in that sense it almost feels like this should be solvable using something simpler than a full LLM: "just" get the embeddings of the words, and then find the groups of 4 that minimize the total distance within the groups.
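
For what it's worth, the "minimize the total distance within the groups" idea can be sketched as a brute-force search in Python. The assumptions are loud ones: the 2D vectors below are made up by hand (real embeddings would come from some model and have hundreds of dimensions), `best_grouping` and its helpers are hypothetical names, and the word count is assumed to be a multiple of the group size.

```python
from itertools import combinations
import math


def group_cost(group, vectors):
    """Sum of pairwise Euclidean distances between the words in one group."""
    return sum(math.dist(vectors[a], vectors[b]) for a, b in combinations(group, 2))


def partitions(words, size):
    """Yield every way to split `words` into unordered groups of `size`."""
    if not words:
        yield []
        return
    first, rest = words[0], words[1:]
    # Fixing the first word in the first group avoids generating duplicates.
    for mates in combinations(rest, size - 1):
        remaining = [w for w in rest if w not in mates]
        for tail in partitions(remaining, size):
            yield [(first, *mates), *tail]


def best_grouping(vectors, size=4):
    """Return the partition with the smallest total within-group distance."""
    words = sorted(vectors)
    return min(
        partitions(words, size),
        key=lambda groups: sum(group_cost(g, vectors) for g in groups),
    )


# Made-up toy "embeddings": two obvious clusters, not output from a real model.
vectors = {
    "BASS": (0.0, 0.1), "TROUT": (0.1, 0.0), "SALMON": (0.0, 0.0), "PIKE": (0.1, 0.1),
    "RED": (5.0, 5.0), "BLUE": (5.1, 5.0), "GREEN": (5.0, 5.1), "PINK": (5.1, 5.1),
}

for group in best_grouping(vectors):
    print(group)
```

A full 16-word board has 2,627,625 ways to split into four groups of four, so exhaustive search is slow in pure Python but still feasible.
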
The problem is that Connections is designed to use tons of alternate definitions and other ambiguities that aren’t well modeled in typical embeddings. Today’s, for instance (spoilers!!), has Coat, Green, Pod, and Soup linked because they all complete “Pea ___”. No embedding would relate them at all, unless that prefix is known a priori.