What a crossword AI reveals about humans' way with words

Y	Hacker News new \| ask \| show \| jobs

	What a crossword AI reveals about humans' way with words (wired.com)
	70 points by gsjbjt 1724 days ago

8 comments

bvm 1724 days ago

I would be interested to see how they perform on UK-style cryptic crosswords (https://en.wikipedia.org/wiki/Cryptic_crossword) where there's much more play/subversion/meta-approach to the "rules".

link

sweezyjeezy 1724 days ago

I think it's pretty unlikely you'll be able to learn the UK rules to a human level using brute force ML - take a fairly easy clue like ‘Drunken men are more despicable' - you need to split this as (wordplay = drunken (anagram of) "men are") / (meaning = "more despicable") = "MEANER"

This is really hard for supervised learning - the reward is quite sparse (e.g. did you get it right / how many characters did you get right) but the task is reasonably complex (e.g. it would have to learn how to spot/execute arbitrary length anagrams on its own, already something that is nontrivial for ML). Sparse + complex usually means gradient descent will fail or converge to a more trivial minima e.g. only look for synonyms in the clue.

I reckon you would have to codify the different types of cryptic clues manually for this work well.

link

ximeng 1724 days ago

https://www.crosswordgenius.com/ not sure how this is done but it looks pretty good.

https://unlikely.ai/cryptic-crossword-genius-unlikely-ai-art... has a little more detail.

https://crosswordgenius.com/clue/who-may-get-drunk-with-real... example solution with anagram.

link

sweezyjeezy 1724 days ago

Very cool - the fact that it explains the solution makes me almost certain they have codified the rules here.

link

SamBam 1724 days ago

Actually, one of the interesting things about cryptic crosswords is that, in some ways, the rules are far more codified.

The specifics definitely depend on the paper, of course, but basics like the fact that the majority of the clues are actually two halves, clueing the word two different ways, and that anagrams are always explicitly clues using words such as "scrambled," "drunken," "off" or whatever could actually make it easier for the human to guide the ML.

link

bvm 1724 days ago

well yes, but the beautiful thing about UK cryptics at least is that they regularly intentionally misdirect the user, so you spend some time going down a rabbit hole and realise that you're on a hiding to nothing. I think that kind of game play is what makes a rules-based approach more challenging. I'll have to check out Crossword Genius though.

link

gorgoiler 1724 days ago

Crossword Genius is from one of the Amazon Alexa founders.

link

jameshart 1724 days ago

You might not need to codify the rules; you could create a tagged training set which includes additional information, like a ‘parts of speech’ breakdown of how a clue relates to an answer (anagramSignifier - anagramMaterial - association, etc.)

An unsupervised learner could probably do reasonably well at picking up on those patterns, even to the point of considering that a word it hasn’t seen used as an anagram signifier before might be doing that job in a particular clue.

On the other hand, machine translation learning has moved away from using tagged parts of speech as far as I’m aware, and it has nonetheless managed to develop sufficient internal modeling that it is as if it has learned parts-of-speech tagging; it’s possible that an ML on a cryptic clue corpus could develop those same hidden models.

link

sweezyjeezy 1724 days ago

That's still codifying the rules - doing it through the data rather than in preprocessing

link

r_c_a_d 1724 days ago

Yes, and fewer crossing letters, so there are usually many more words that will "fit".

link

weeblewobble 1724 days ago

A few years back I wrote a program that, given a blank or partially-filled in (NYT-style) crossword grid (with black squares already inserted), could fill out the rest of the grid with valid words/phrases both across and down. It had no relation to clues though, you had to write the clues yourself after the grid was filled out.

I wrote it because I wanted to make my dad (a huge nyt crossword fan) a custom crossword for his birthday. I put in a bunch of phrases related to him and our family and let the program fill in the rest. It was a huge hit, never really went back to it though. Anyone know if anything else like this exists?

link

pessimizer 1724 days ago

Don't use that as a final product, use it as a crossword book generator. Selling the books would be a lot more lucrative (or selling the actual puzzles to someone who already prints crosswords.)

Especially if you can generate them from arbitrary lists of of clues:words.

link

xkeysc0re 1724 days ago

There is Phil, the free crossword maker http://www.keiranking.com/apps/phil/ but I haven't had the best luck with its automated fill. Lately I've been using Crossfire http://beekeeperlabs.com/crossfire/ to make puzzles and its auto fill is quite versatile, even if some of the choices can be iffy (but easily fixed with some manual tweaking). You can even provide your own custom dictionary for it to use in the fill.

link

weeblewobble 1723 days ago

The Phil auto-fill button doesn't do anything for me. Crossfire looks like it has this feature though, and it's probably a lot faster than mine

link

peab 1724 days ago

I did the same thing, as a side project when covid first hit. It's a fun and difficult problem to solve efficiently. I remember seeing some videos on YouTube of algorithms that could do it as well.

link

weeblewobble 1723 days ago

nice, what did you end up going with? I built a "scoring" function for candidates based on how many other words can intersect each letter (based on grid layout and squares already filled) plus a backtracking system. ended up working pretty well, although each grid took >1 minute. I used an NYT historical word list as the dictionary

link

peab 1723 days ago

interesting - I was trying to think of a backtracking system as well, but wasn't able to create something efficient. Did it work well for large dense crosswords?

link

eutectic 1723 days ago

I had good success using a DAWG for pruning, and letter-wise (rather than word-wise) branching, picking the letter with fewest options. If you want to get fancy, you can use back-jumping instead of backtracking.

link

pwdisswordfish0 1723 days ago

I’ve used QXW very successfully.

The trick is to have a good word list… I’m working on one for German, but it’s a bit of an undertaking and I guess pretty subjective.

link

bvm 1724 days ago

> The crossword solver is a closed system—it can’t just Google the answers.

yet from the wiki:

> Probabilities for individual words or phrases in the puzzle are computed using relatively simple statistical techniques based on features such as previous appearances of the clue, number of Google hits for the fill. https://en.wikipedia.org/wiki/Dr.Fill

I don't think it's fair to describe it as a closed-system if it's Googling stuff, even for pruning.

link

karamanolev 1724 days ago

Can't the number of Google hits for all words in the dictionary be enumerated and stored before entry into the competition? Wouldn't that keep it a closed system, even if it uses metrics like that?

link

wongarsu 1724 days ago

Humans are allowed to prepare however they like, so anything from the real world should be fair game as long as you don't add anything during the solving process itself.

link

SamBam 1724 days ago

It's not googling stuff live, it's using a pre-computed database.

link

EGreg 1724 days ago

Sounds like how AlphaGo beat Stockfish or Rybka… massive precomputing vs alphabeta search in real time

link

sillysaurusx 1724 days ago

Hmm. AlphaGo and AlphaZero were trained using self play. I don't think there was any precomputing, unless you count training as precomputation.

link

EGreg 1723 days ago

Precomputation on a huge cluster for a long time

link

dom111 1724 days ago

https://archive.is/OuoeZ

link

isolli 1724 days ago

"But, unlike more than 200 human solvers, it wasn’t perfect on all of the puzzles: It got waylaid on two of them and finished with errors."

I'm curious, what kind of error are we talking about? Words that don't exist, or another solution to a problem that may not have a unique solution?

link

tzs 1724 days ago

I wonder how it does on puzzles where the answers need to be written in unusual ways? Some examples from the New York Times puzzles.

- There was one with a name that suggested an Alice in Wonderland connection, and it had an answer "THE LOOKING GLASS" (no spaces) running vertically down the full length of the center of the grid.

Every across answer that was entirely to the left of that was written normally. Every across answer entirely to the right was written backwards. Every across answer that crossed the center was a palindrome centered on the center.

- There was one where several answers were triple Spoonerisms of well known phrases.

For example, the answer "THE STUCK HOPS BEER" for the clue "Tagline in an ad for Elmer's Glue-Ale". Rotate the ST from STUCK, the H from HOPS, and the B from BEER and you get "THE BUCK STOPS HERE".

- I remember one that had a few isolated black squares, and a theme that suggested the puzzle had something to do with roundabouts.

Those black squares were roundabouts. Answers would hit the roundabout and continue after a 90 degree turn.

- I remember one where the theme was something like "What goes up must come down". That had several answers that, like the roundabout one above, would take a 90 degree turn from across but it was a left turn so after the turn they went up. Then there would be a down answer whose start was the reversed end of that across answer that would come down, also make a left tern, and continue across to the right.

link

uhoh-itsmaciek 1724 days ago

Those are great. Another memorable NYT one had the theme A SHOT IN THE DARK, and several answers ended in SHOT (RIMSHOT, EARSHOT, etc.), but the grid only had space for the first part of the word, so the actual answers were RIM⬛, EAR⬛, etc. Another had a RISE FROM THE ASHES theme, and had several answers that ended in ASH that did not seem to fit the clue: the actual answer followed vertically where the ASH ending started. E.g., for the clue "Tell it like it is", the answer that fit straight across was "TALKS TRASH", but above the "A", going up, were I, G, H, and T, giving the actual answer "TALK STRAIGHT".

link

ptero 1724 days ago

Non-existent words should be easy to check for. My guess is answers not (quite) matching the clues. Analogies and cultural references are probably hard for AI to get: is an "empire fighting knight" historical figure or Jedi? One level of such referencing is doable, 2-3 likely very hard.

link

tasogare 1724 days ago

> Non-existent words should be easy to check for.

I'm interested in your magical solution because this is a hard problem.

link

tsimionescu 1724 days ago

You should have given more detail on why this problem is deceptively hard. I am guessing that the simple solution of looking up the word in the dictionary seems to work ok (especially in the context of an artificial competition, which doesn't have to accept uncommon spellings, words in other languages etc), but still breaks down hard because of proper names, which are common in cross-words.

link

SamBam 1724 days ago

I think the point is that "Non-existent words should be easy to check for" is a useless metric, because plenty of answers are non-existent words. Proper names and things like that, many of which the computer may not have in its database.

So it can't simply blindly reject answers that are non-existent words.

An example of how this computer could make a mistake: Sometimes two proper names are crossing at a vowel. If you don't know either name, you sometimes have to guess blindly at the answer. (This exact scenario is rare in crosswords like the Times, but do occasionally come up, and similar scenarios exist.)

link

ptero 1724 days ago

I would definitely be interested in knowing why it is hard. And yes, I was thinking about lookup: for example from (1) a dictionary, (2) a set of proper names and (3) a set of previous crossword answers. While nothing is perfect, but this (from an armchair) seems like it should work pretty well. And I am not proposing it as the main part of the algorithm, just a check on non-words.

link

tasogare 1724 days ago

There is no dictionary that contains "all existing words" for any giving language.

link

tsimionescu 1723 days ago

It depends a lot on what you mean by a language. Iglf you define 'the English language' as 'all of the words that some amount of people who identify as speaking English would understand', then of course there is no dictionary that would cover that (but by this definition, many words in the English language are Indian, Chinese, Romanian, Russian, etc, and would be completely incomprehensible to the vast majority of people in the USA or England). On the other hand, many people define the concept of a 'correct English word' as 'any word with a definition in the OED or Merriam Webster (ignoring proper nouns)', and leave other words as being 'wrong/foreign language'.

Either way, this is all moot when discussing a crossword puzzle contest, which explicitly limits itself to words in a specific dictionary + proper nouns. The problem of proper nouns is still extreme, and brings down the whole idea, but at least the problem of recognizing 'all possible common nouns that could be present in the crossword contest' is simple.

link

banana_giraffe 1724 days ago

A similar thing happens even with human solvers from time to time. It's possible to solve a puzzle with answers that match the clue but aren't what was intended. Of course, the more you let yourself deviate from the clue, the more possibilities exist.

My favorite example of using this is the 1996-11-05, the day of the presidential election, NY Times crossword with the clue "Lead story in tomorrow's newspaper (!), with 43-Across". The puzzle worked with both "CLINTON" or "BOBDOLE" in the crossword.

link

djrockstar1 1724 days ago

Also, as a novice to crosswords, how do puzzle makers ensure that they have a unique solution?

link

OscarCunningham 1724 days ago

Related, this crossword designed to have two solutions: https://imgur.com/Atd5Dry

link

bloak 1724 days ago

In a traditional cryptic crossword each individual word has at least two independent things in the clue referring to it, for example, in an unrealistically simple case, "it means X" and "it's an anagram of Y", so each individual answer is very likely to be unique. In addition, alternate letters of each answer are part of another word, so the whole thing is extremely likely to have a unique solution.

However, with a combination of a bad clue and a bit of bad luck it does sometimes happen that an experienced crossword solver has to take a guess before sending in their solution. I'm not an experienced crossword solver, but I've seen them at work, and I would guess that finding a good alternative solution happens less than one time in 50. That's just a guess, though. For a better estimate one should systematically compare the solutions produced by competent crossword solvers with the official solutions.

link

tsimionescu 1724 days ago

In a crossword puzzle, each word shares letters with at least two other words, likely many more. So, without any effort from the puzzle desogner, it is very unlikely to actually have all of the words have ambiguous enough solutions that match the letter count etc. Even a crossword where half the clues are 'any word' will have a good chance of having a unique solution overall.

Of course, people don't like to play crosswords like sudoku, blindly matching letters, so there does have to be some skill to the clue design to have a relatively small amount of plausible answers, if not being entirely unique.

link

tclancy 1724 days ago

They don’t necessarily. Some times this is done on purpose at harder difficulties, the clueing is intentionally vague to lead you down blind alleys.

link

wirthjason 1724 days ago

For some reason only the first paragraph was shown. I had to view it in privacy mode to see the whole article. I wonder if the ad blocker on my iPhone triggered something.

link

wirthjason 1724 days ago

This makes me curious, how are crossword puzzles made? Clues -> words? Or working backwards from words to clues? Some blend of the two?

link

banana_giraffe 1724 days ago

The crossword writer I've seen discuss it, generally start with theme words, then fill in the rest of the words, then work on clues.

Then an editor will refine the clues to fit whatever criteria they may have for publication.

link

ygra 1724 days ago

There was a video next to the article for me that discussed exactly that ... https://youtu.be/aAqQnXHd7qk

link

eutectic 1724 days ago

I wonder how well gpt-3 + search would perform as a crossword-solving agent. I guess the tokenization might be tricky to deal with.

link