Hacker News new | ask | show | jobs
by AlbertVAustin 1356 days ago
> Even if we did do that, I've done some research on measuring difficulty of Sudoku -- many Sudoku grids are unsolvable by humans -- they are far too hard.

> A different, and interesting, question would be how many Sudoku are there which can be considered human-solvable. That's an even more open problem (we'd have to exactly define human-solvable to start).

What does it mean for a sudoku to be non solvable by human? I wrote a sudoku solver once and thought humans solving harder sudokus is basically exactly what a computer would do: guessing number and trying to solve it further, backtracking on failure. Or do you mean non-human solvable as "it would take a lot of time for a human to solve it"?

6 comments

Not the GP, but I think in general the “guess and check” strategy frowned-upon by the Sudoku community. Ideally you should be able to make progress by making some logical observation (a trivial example: the numbers 1-8 are present in this column, therefore the remaining cell in this column is a 9). Humans can carry this out to complex patterns (the “swordfish” involves looking at 6 cells in 3 different rows/columns and inferring that cells visible to those cannot take on particular values), but it is seems reasonable to say that patterns will reach complexity that humans can’t hold in their heads. Obviously, any such classification would be subjective, but there’s probably a “reasonable” limit to solvability. (Without guess and check)
Obviously, any such classification would be subjective, but there’s probably a “reasonable” limit to solvability. (Without guess and check)

It’s totally subjective. Since sudoku is a closed (fixed size grid, fixed number of symbols), guess-and-check is just a catch-all label for logical inference strategies which haven’t been named yet. New strategies are discovered (and named) all the time and one of the ways to do this is to guess and then, if the guess was correct, go back and try to understand why.

This bias against guess and check seems to be some deep-seated issue from our culture. I know a lot of mathematics teachers frown on guess-and-check as well and that can rub off on their students, possibly implanting the bias for life. Unfortunately, having this bias can really damage a person’s ability to learn and succeed at mathematics in university. It turns out that guess-and-check makes a triumphant return as a strategy for quickly completing proofs when one is unfamiliar with established theory. Sound familiar? That’s just like sudoku!

Working mathematicians, in contrast to poor math teachers at lower levels, have a healthy relationship to the guess-and-check strategy. And that’s good news for them, since they are often working in areas where there is insufficient established theory to make any progress.

It's all down to whether or not there is any logic to the guess and check strategy. Most of the computer algorithms work by picking a number in the possible values, following the eliminations, and repeating. If they hit an error, they backtrack to the last decision, choose a new number from the set and repeat.

If you look at some of the Cracking the Cryptic puzzles on YouTube with "computer" in the title, that covers cases where the computer resorts to a guess and check strategy but where better underlying logic is available in the solve path.

The finite nature of sudoku puzzles guarantees that there is an underlying logic to a given move, whether the solver is aware of it or not. Ultimately, all sudoku strategies are backed by an overriding master logic:

"Does this branch of the game tree contain the completed solution or not?"

Whether or not humans have come up with a name for a given strategy depends, almost entirely, on the depth of that branch.

No that’s not the case. Having a single solution does not mean that there is a logical path to it.

If I have a puzzle where I’m down to a bunch of different options and the only way to make progress is to just try fixing different numbers until the puzzle resolved that not a logical progression , it’s an exhaustive search.

If the solution is unique then there is a logical path to it, though maybe not one feasible to discover for an agent with bounded time and (especially for humans) bounded working memory.

Consider the search tree of a backtracking algorithm operating on a SAT-based representation of a puzzle. A leaf of the tree corresponds to some clause that becomes empty after eliminating literals assigned on the path to that node. If you flip the tree upside down and start from the clauses at the leaves, then two leaves with a common parent represent two clauses with complementary literals that can be resolved upon. So any branch of the algorithm's search tree can be viewed as a tree-structured resolution-refutation proof eliminating some value given a set of prior assignments on the path to the branch, and the tree as a whole can be decomposed into an ordered set of proofs for each elimination in the path to an overall solution.

Or, if you prefer, start again from the SAT-based representation and run a prime implicates algorithm like Tison's. Such an algorithm never advances a "guess" or generates a clause that is not entailed. It simply finds one logical consequence after another until it has found all of them, throwing away subsumed clauses along the way until all that's left is the unique solution.

Either way you will always produce a path to a solution and proof that the solution is unique using elementary rules of propositional logic that anyone can recognize. That, I think, is what your parent means.

Of course, the existence of a proof doesn't mean that a human can find that proof, and that's what we usually mean when we say there's no logical solution. That is, nothing new can be proven by scanning the puzzle looking for pattern matches against a fixed library of known proof templates taken to be the universe of logical techniques, and nothing can be proven from less structured heuristic search within the limits of human working memory and patience.

guess and check for a human works when the problem space is small enough. You can't guess and check yourself through a 300-move forced mate in chess or some bizarre sudoku before dying of old age.

Yeah in theory a human can brute force or guess like a computer, but that's obviously not the point of a meaningful game or problem to solve because it's mindless.

Nothing about the guess and check strategy implies that you have zero prior information and must guess blindly in an attempt to win the lottery for picking the correct move. Typically, people using guess and check have other principles, rules, or heuristics they use to narrow the search space down to a handful of options.

My original point still stands. A cultural bias against guess-and-check is just another way of saying that strategies which determine an exact sequence of moves using only strict deduction from known theory are the only “acceptable” way to play sudoku. That, to me, is highly limiting and absurd. It’s also inefficient because often you’re faced with a choice between two mutually exclusive options and it’s just plain faster and less error-prone to guess and check than it is to carry out all of the deduction in your head before making the correct move.

it's not an absurd limitation. You're not playing sudoku to beat an opponent (idk maybe there's competitive sudoku where people guess if it's most efficient), you're playing to get better at reasoning. Deduction literally is the game, not filling squares with numbers.

If you're doing programming exercises you can brute force or guess a solution but there's a cultural bias against doing that, because what you're trying to learn is how to be smart about the problem, otherwise the exercise is pointless.

Guessing never adds anything to the experience in a game that's deliberately set up to be a test of skill. It's like doing an actual puzzle. Yes you can trial and error the edges of the pieces randomly, maybe that's faster, but it defeats the purpose of puzzling. A puzzle that heavily benefits from guesswork is just a bad puzzle.

If you're doing programming exercises you can brute force or guess a solution but there's a cultural bias against doing that, because what you're trying to learn is how to be smart about the problem, otherwise the exercise is pointless.

What exactly counts as “being smart about the problem” is culturally determined. In my view, taking ten times as long to solve the problem using pure deduction (because you’re struggling to hold all of the possibilities in your head simultaneously) is being less smart about the problem than making an informed guess and then quickly and mechanically checking whether you were correct.

A puzzle that heavily benefits from guesswork is just a bad puzzle.

That puts an upper limit on the difficulty rating of puzzles which is in some sense limited by the imagination of solvers.

IOW: intelligence is data compression.
Intelligence is the ability to craft a more optimal solution than a brute force approach would take.

In the domain of data compression, Kolmogorov complexity https://en.wikipedia.org/wiki/Kolmogorov_complexity is one of the approaches to try to measure that amount.

Guess-and-check is not the same as brute force (nor exhaustive search). No one is sitting there with an empty sudoku grid, filling the whole thing in with guessed numbers, checking to see if the solution is valid, then erasing everything and starting over. Literally no one does that, not even software sudoku solvers (it takes way too long).

Guess-and-check is literally just another name for backtracking algorithms. These do not work (for humans) without an adequate suite of rules that can be use to prune the search space.

> New strategies are discovered

I highly doubt it. In Sudoku, it is either logically deducible that a particular value is to be placed into a given square. Or else it isn't; one can deduce that one of several values can be placed there, after all other exploration elsewhere has been exhausted.

Deducing what can be placed into what square is a simple elimination. If the square is in a row that already has 7, 7 cannot be placed there. If it's in a 3x3 square that has 5, 5 cannot be placed there. If it's in a column that has 4, 4 cannot be placed there. For every square of the Sudoku board, we can do this evaluation. We hit upon a square where all values are eliminated but one, and so that one is confirmed. Based on that confirmation, we can update the other eliminations and keep going. When this process cannot continue, there is no choice but to guess. OK, here we can have a 7 or a 4, and so it goes.

I don't see how you can pretend to have some strategies here, let alone ones with pet names and all.

People have different systems for tracking the information, to be sure.

Sure, you can always use backtracking to solve any sudoku, but people come up with new, more effective strategies all the time, eg. https://www.youtube.com/watch?v=ZLcey7qiXv8
What is "more effective" than "solve any"?
I think there's some leeway for bifurcation, so "guess and check" when there's only two possibilities. It's not as "clean", but from what I gather it's commonplace in competitive sudoku because in some situations it's the fastest way to find the next digit.

But obviously to get to bifurcation you still need to infer enough to narrow a cell down to two choices, so that would still put a limit on solvability.

I would not call what the GP describes as ”guess and check”, but as “backtracking”.

That is just making logic observations, using the grid as a memory aid (“let’s see, if foo is a 1 and bar is a 2, then that’s a 7, and then that must be a 4, and that a 3, and then we hit a dead end, so the claim “this is a 1 and this is a 2” in’s true. Maybe foo is a 1 and bar is a 3? No, that’s clearly incorrect. Maybe foo is a 1 and bar is a 4? Etc)

I think human Sudoku solving is different, though. Humans don’t do a depth-first search, but a breadth-first one for paths that soon lead to being able to fill in some squares. Experienced solvers probably have heuristics for finding them.

I think I would rate the difficulty of a puzzle by the ¿average? depth needed in such a depth-first search to permanently place a new digit in a new cell, with correction factors for the number of paths at that depth that do that and the number of cells to fill in.

Generally if a puzzle requires bifurcation it’s considered to be at least partially broken.

The setter is expected to have made it so that determining each cell can be done without simply seeing if a choice eventually succeeds. Of course exactly how many steps to count as bifurcation is fairly nebulous.

> frowned-upon by the Sudoku community

{citation-needed}

What community, and does it include mathematicians and logicians?

Sudoku puzzles that are solvable without guessing must be generated in a way that this is assured.

There is also an understanding that Sudoku puzzles should have exactly one solution.

It's possible for Sudoku puzzles to have a unique solution, yet be impossible to solve without guessing and backtracking. The process of deduction reaches a point at which it cannot be deduced what value to fill next into what square. It is narrowed down to several possibilities, which have to be pursued in parallel; e.g with backtracking. You follow all the paths, and discard them, returning to the divergent point, whenever a contradiction is reached: violation of the Sudoku constraints.

You can't just frown upon some solution method which is inescapable.

A human with sufficient pens and paper available is a (time-constrained) universal Turing machine, so the only Sudokus that aren't solvable by humans are those that take longer than a few decades to solve. In practice, humans don't like spending that much time on solving a single puzzle. And to most humans emulating a Turing machine is also not particularly enjoyable.
There are lots of Sudokus wuth multiple correct solutions, and no way to differentiate them.
These shouldn't be considered valid sudoku puzzles. Uniqueness is an attribute of valid puzzles.
Generally people solve Sudokus in almost the exact opposite way.

They do very clever reasoning (much more advanced than basically any computer sudoku solver), but very little, or no, backtracking. This is mainly because backtracking more than 1 or 2 times becomes unreasonable (do you start making hundreds of photocopies of your Sudoku?)

The reason computer solvers don't do the advanced reasoning which humans do is it turns out it isn't worth it -- by the time you find and apply the reasoning, you can have done a few million backtracks, which will usually have solved the problem anyway.

I once wrote a Sudoku solver using the following algorithm: For each empty field, find which values could possibly be written there (which values are not already written in the same row/column/quadrant). Then it applied the following two rule: (1) If there is only one possible value for a field, write it there. (2) For each row/column/quadrant, for each value: if the value can possibly be written only in one field, write it there. Repeat this in a while loop, finish if you checked the rules and couldn't write anything.

No recursion, constant memory usage.

I assumed that this would not be enough. I was like "first I will code the obvious thing, then try the algorithm on examples from a book I bought, find the examples this cannot solve, and then think about further rules which could solve those examples". But as I tried it on the puzzles in the book, it solved all of them.

Which went against my intuition, as I was pretty sure I sometimes use more complicated rules when solving Sudoku by hand. But it turned out that it is somehow really simply for me to miss an opportunity.

This is of course not a proof that each uniquely solvable Sudoku can be solved by this algorithm, but I think that in practice at least 99% of them can be.

Then I tried an opposite thing, to use this solver to create puzzles. Like, create a full board, then try removing random numbers and check if the solver can solve this; stop when no number can be removed. This actually created pretty hard puzzles; harder than any example in my book.

Search for 'minimal Sudoku' I think it's 17 given numbers and try those on your solver. I wrote a backtracking solver which tried on the fields with least possibilities first. It needed considerable time for these puzzles.
This is a fun way of solving, but I think your book was fairly simple :)

A good way to find hard sudoku, and hard techniques, is "sudokuwiki.org".

The process you're describing is commonly (and usually derisively) referred to as "bifurcation," because experience solvers will generally only resort to "guessing" when there are only two possible values for a cell.

Strategies for solving sudoku puzzles are many and varied, but if a puzzle can only be solved by "guessing a number" rather than logically working out where a number goes or which number goes in a cell, it's probably a pretty bad puzzle.

Puzzles which require guessing at any point are generally considered "not solvable". The goal of solving a sudoku puzzle is not actually to get a grid with 81 numbers in it, and trying to progress by guessing defeats the whole point.