I would be interested to see how they perform on UK-style cryptic crosswords (https://en.wikipedia.org/wiki/Cryptic_crossword) where there's much more play/subversion/meta-approach to the "rules".
I think it's pretty unlikely you'll be able to learn the UK rules to a human level using brute force ML - take a fairly easy clue like ‘Drunken men are more despicable' - you need to split this as (wordplay = drunken (anagram of) "men are") / (meaning = "more despicable") = "MEANER"
This is really hard for supervised learning - the reward is quite sparse (e.g. did you get it right / how many characters did you get right) but the task is reasonably complex (e.g. it would have to learn how to spot/execute arbitrary length anagrams on its own, already something that is nontrivial for ML). Sparse + complex usually means gradient descent will fail or converge to a more trivial minima e.g. only look for synonyms in the clue.
I reckon you would have to codify the different types of cryptic clues manually for this work well.
Actually, one of the interesting things about cryptic crosswords is that, in some ways, the rules are far more codified.
The specifics definitely depend on the paper, of course, but basics like the fact that the majority of the clues are actually two halves, clueing the word two different ways, and that anagrams are always explicitly clues using words such as "scrambled," "drunken," "off" or whatever could actually make it easier for the human to guide the ML.
well yes, but the beautiful thing about UK cryptics at least is that they regularly intentionally misdirect the user, so you spend some time going down a rabbit hole and realise that you're on a hiding to nothing. I think that kind of game play is what makes a rules-based approach more challenging. I'll have to check out Crossword Genius though.
You might not need to codify the rules; you could create a tagged training set which includes additional information, like a ‘parts of speech’ breakdown of how a clue relates to an answer (anagramSignifier - anagramMaterial - association, etc.)
An unsupervised learner could probably do reasonably well at picking up on those patterns, even to the point of considering that a word it hasn’t seen used as an anagram signifier before might be doing that job in a particular clue.
On the other hand, machine translation learning has moved away from using tagged parts of speech as far as I’m aware, and it has nonetheless managed to develop sufficient internal modeling that it is as if it has learned parts-of-speech tagging; it’s possible that an ML on a cryptic clue corpus could develop those same hidden models.
A few years back I wrote a program that, given a blank or partially-filled in (NYT-style) crossword grid (with black squares already inserted), could fill out the rest of the grid with valid words/phrases both across and down. It had no relation to clues though, you had to write the clues yourself after the grid was filled out.
I wrote it because I wanted to make my dad (a huge nyt crossword fan) a custom crossword for his birthday. I put in a bunch of phrases related to him and our family and let the program fill in the rest. It was a huge hit, never really went back to it though. Anyone know if anything else like this exists?
Don't use that as a final product, use it as a crossword book generator. Selling the books would be a lot more lucrative (or selling the actual puzzles to someone who already prints crosswords.)
Especially if you can generate them from arbitrary lists of of clues:words.
There is Phil, the free crossword maker http://www.keiranking.com/apps/phil/ but I haven't had the best luck with its automated fill. Lately I've been using Crossfire http://beekeeperlabs.com/crossfire/ to make puzzles and its auto fill is quite versatile, even if some of the choices can be iffy (but easily fixed with some manual tweaking). You can even provide your own custom dictionary for it to use in the fill.
I did the same thing, as a side project when covid first hit. It's a fun and difficult problem to solve efficiently. I remember seeing some videos on YouTube of algorithms that could do it as well.
nice, what did you end up going with? I built a "scoring" function for candidates based on how many other words can intersect each letter (based on grid layout and squares already filled) plus a backtracking system. ended up working pretty well, although each grid took >1 minute. I used an NYT historical word list as the dictionary
interesting - I was trying to think of a backtracking system as well, but wasn't able to create something efficient. Did it work well for large dense crosswords?
I had good success using a DAWG for pruning, and letter-wise (rather than word-wise) branching, picking the letter with fewest options. If you want to get fancy, you can use back-jumping instead of backtracking.
> The crossword solver is a closed system—it can’t just Google the answers.
yet from the wiki:
> Probabilities for individual words or phrases in the puzzle are computed using relatively simple statistical techniques based on features such as previous appearances of the clue, number of Google hits for the fill.
https://en.wikipedia.org/wiki/Dr.Fill
I don't think it's fair to describe it as a closed-system if it's Googling stuff, even for pruning.
Can't the number of Google hits for all words in the dictionary be enumerated and stored before entry into the competition? Wouldn't that keep it a closed system, even if it uses metrics like that?
Humans are allowed to prepare however they like, so anything from the real world should be fair game as long as you don't add anything during the solving process itself.
I wonder how it does on puzzles where the answers need to be written in unusual ways? Some examples from the New York Times puzzles.
- There was one with a name that suggested an Alice in Wonderland connection, and it had an answer "THE LOOKING GLASS" (no spaces) running vertically down the full length of the center of the grid.
Every across answer that was entirely to the left of that was written normally. Every across answer entirely to the right was written backwards. Every across answer that crossed the center was a palindrome centered on the center.
- There was one where several answers were triple Spoonerisms of well known phrases.
For example, the answer "THE STUCK HOPS BEER" for the clue "Tagline in an ad for Elmer's Glue-Ale". Rotate the ST from STUCK, the H from HOPS, and the B from BEER and you get "THE BUCK STOPS HERE".
- I remember one that had a few isolated black squares, and a theme that suggested the puzzle had something to do with roundabouts.
Those black squares were roundabouts. Answers would hit the roundabout and continue after a 90 degree turn.
- I remember one where the theme was something like "What goes up must come down". That had several answers that, like the roundabout one above, would take a 90 degree turn from across but it was a left turn so after the turn they went up. Then there would be a down answer whose start was the reversed end of that across answer that would come down, also make a left tern, and continue across to the right.
Those are great. Another memorable NYT one had the theme A SHOT IN THE DARK, and several answers ended in SHOT (RIMSHOT, EARSHOT, etc.), but the grid only had space for the first part of the word, so the actual answers were RIM⬛, EAR⬛, etc. Another had a RISE FROM THE ASHES theme, and had several answers that ended in ASH that did not seem to fit the clue: the actual answer followed vertically where the ASH ending started. E.g., for the clue "Tell it like it is", the answer that fit straight across was "TALKS TRASH", but above the "A", going up, were I, G, H, and T, giving the actual answer "TALK STRAIGHT".
Non-existent words should be easy to check for. My guess is answers not (quite) matching the clues. Analogies and cultural references are probably hard for AI to get: is an "empire fighting knight" historical figure or Jedi? One level of such referencing is doable, 2-3 likely very hard.
You should have given more detail on why this problem is deceptively hard. I am guessing that the simple solution of looking up the word in the dictionary seems to work ok (especially in the context of an artificial competition, which doesn't have to accept uncommon spellings, words in other languages etc), but still breaks down hard because of proper names, which are common in cross-words.
I think the point is that "Non-existent words should be easy to check for" is a useless metric, because plenty of answers are non-existent words. Proper names and things like that, many of which the computer may not have in its database.
So it can't simply blindly reject answers that are non-existent words.
An example of how this computer could make a mistake: Sometimes two proper names are crossing at a vowel. If you don't know either name, you sometimes have to guess blindly at the answer. (This exact scenario is rare in crosswords like the Times, but do occasionally come up, and similar scenarios exist.)
I would definitely be interested in knowing why it is hard. And yes, I was thinking about lookup: for example from (1) a dictionary, (2) a set of proper names and (3) a set of previous crossword answers. While nothing is perfect, but this (from an armchair) seems like it should work pretty well. And I am not proposing it as the main part of the algorithm, just a check on non-words.
It depends a lot on what you mean by a language. Iglf you define 'the English language' as 'all of the words that some amount of people who identify as speaking English would understand', then of course there is no dictionary that would cover that (but by this definition, many words in the English language are Indian, Chinese, Romanian, Russian, etc, and would be completely incomprehensible to the vast majority of people in the USA or England). On the other hand, many people define the concept of a 'correct English word' as 'any word with a definition in the OED or Merriam Webster (ignoring proper nouns)', and leave other words as being 'wrong/foreign language'.
Either way, this is all moot when discussing a crossword puzzle contest, which explicitly limits itself to words in a specific dictionary + proper nouns. The problem of proper nouns is still extreme, and brings down the whole idea, but at least the problem of recognizing 'all possible common nouns that could be present in the crossword contest' is simple.
A similar thing happens even with human solvers from time to time. It's possible to solve a puzzle with answers that match the clue but aren't what was intended. Of course, the more you let yourself deviate from the clue, the more possibilities exist.
My favorite example of using this is the 1996-11-05, the day of the presidential election, NY Times crossword with the clue "Lead story in tomorrow's newspaper (!), with 43-Across". The puzzle worked with both "CLINTON" or "BOBDOLE" in the crossword.
In a traditional cryptic crossword each individual word has at least two independent things in the clue referring to it, for example, in an unrealistically simple case, "it means X" and "it's an anagram of Y", so each individual answer is very likely to be unique. In addition, alternate letters of each answer are part of another word, so the whole thing is extremely likely to have a unique solution.
However, with a combination of a bad clue and a bit of bad luck it does sometimes happen that an experienced crossword solver has to take a guess before sending in their solution. I'm not an experienced crossword solver, but I've seen them at work, and I would guess that finding a good alternative solution happens less than one time in 50. That's just a guess, though. For a better estimate one should systematically compare the solutions produced by competent crossword solvers with the official solutions.
In a crossword puzzle, each word shares letters with at least two other words, likely many more. So, without any effort from the puzzle desogner, it is very unlikely to actually have all of the words have ambiguous enough solutions that match the letter count etc. Even a crossword where half the clues are 'any word' will have a good chance of having a unique solution overall.
Of course, people don't like to play crosswords like sudoku, blindly matching letters, so there does have to be some skill to the clue design to have a relatively small amount of plausible answers, if not being entirely unique.
For some reason only the first paragraph was shown. I had to view it in privacy mode to see the whole article. I wonder if the ad blocker on my iPhone triggered something.