Hacker News new | ask | show | jobs
by taeric 3334 days ago
You can actually make a sudoku problem where, by the rules of the game, all open pieces have at least 2 possible values while still having a unique solution. So, to that end, I am unaware of how you could make a solver that doesn't have to guess. (As noted in a sibling post, I wouldn't be shocked to find I was wrong, but I would be interested in the details.)

So, I'm ultimately wary of the claims here. The system has to be doing a search. It may be crafted in such a way that it just walks straight to the solution for some specifications, but that is as much from luck of easy constraints as it is the algorithm.

1 comments

When you say "all open pieces have at least 2 possible values", how can the solution be unique? Surely for each piece, some values are not admissible, otherwise this would be a contradiction. Sufficiently strong pruning could have eliminated them, either by search or by more reasoning.
First, apologies for going to sleep on you. Huge thanks for sticking with this and opening my eyes to this branch of programming. If you have recommended texts, is appreciate it.

For this question, I can really only refer to Knuth. This is exercise 515 of 7.2.2.2. I likely just messed you up on the wording. There are no forced moves for open squares. As soon as you pick one, though, it may force all others.

In the solution of this exercise, 4 example puzzles are given. Here is the first one:

    knuth(a, [[1,_,_,_,_,6,_,8,_],
              [5,_,8,7,2,1,4,_,6],
              [_,6,_,3,8,_,2,_,1],
              [8,4,_,_,_,3,_,_,5],
              [_,_,5,_,6,_,8,_,_],
              [6,_,_,8,_,_,_,4,2],
              [3,_,6,_,4,8,_,2,_],
              [4,_,7,6,3,2,1,_,8],
              [_,8,_,5,_,_,_,_,4]]).
Consider now a Prolog solution for 9x9 (i.e., "normal") Sudoku puzzles, using integer constraints:

    sudoku(Rows) :-
            length(Rows, 9), maplist(same_length(Rows), Rows),
            append(Rows, Vs), Vs ins 1..9,
            maplist(all_distinct, Rows),
            transpose(Rows, Columns), maplist(all_distinct, Columns),
            Rows = [As,Bs,Cs,Ds,Es,Fs,Gs,Hs,Is],
            blocks(As, Bs, Cs), blocks(Ds, Es, Fs), blocks(Gs, Hs, Is).

    blocks([], [], []).
    blocks([N1,N2,N3|Ns1], [N4,N5,N6|Ns2], [N7,N8,N9|Ns3]) :-
            all_distinct([N1,N2,N3,N4,N5,N6,N7,N8,N9]),
            blocks(Ns1, Ns2, Ns3).
This is indeed exactly a case where search is necessary, because the pruning applied by this concrete constraint solver is not strong enough to deduce the unique solution. If we only post the constraints, we get:

    ?- knuth(a, Rows),
       sudoku(Rows),
       maplist(portray_clause, Rows).
    [1, _, _, _, _, 6, _, 8, _].
    [5, _, 8, 7, 2, 1, 4, _, 6].
    [_, 6, _, 3, 8, _, 2, _, 1].
    [8, 4, _, _, _, 3, _, _, 5].
    [_, _, 5, _, 6, _, 8, _, _].
    [6, _, _, 8, _, _, _, 4, 2].
    [3, _, 6, _, 4, 8, _, 2, _].
    [4, _, 7, 6, 3, 2, 1, _, 8].
    [_, 8, _, 5, _, _, _, _, 4].

However, it is exactly as you say: Even assigning a single variable a concrete value automatically enforces the solution by constraint propagation alone:

    ?- knuth(a, Rows),
       sudoku(Rows),
       Rows = [[_,Var|_]|_],
       indomain(Var),
       maplist(portray_clause, Rows).
    [1, 2, 3, 4, 5, 6, 7, 8, 9].
    [5, 9, 8, 7, 2, 1, 4, 3, 6].
    [7, 6, 4, 3, 8, 9, 2, 5, 1].
    [8, 4, 2, 1, 9, 3, 6, 7, 5].
    [9, 7, 5, 2, 6, 4, 8, 1, 3].
    [6, 3, 1, 8, 7, 5, 9, 4, 2].
    [3, 1, 6, 9, 4, 8, 5, 2, 7].
    [4, 5, 7, 6, 3, 2, 1, 9, 8].
    [2, 8, 9, 5, 1, 7, 3, 6, 4].
Note though the general point: A sufficiently strong constraint solver could have deduced the unique solution even without trying any concrete value, exclusively by reasoning about the posted constraints with sufficient sophistication! This concrete solver can't, due to the particular trade-off between efficiency and pruning strength that was chosen by the implementor, and so search must settle the remaining part.

One of the other puzzles is:

    knuth(d, [[1,_,3,_,5,6,_,8,9],
              [6,8,_,3,_,9,1,_,5],
              [_,9,5,1,8,_,6,3,_],
              [3,_,8,9,6,_,_,5,1],
              [_,1,9,5,_,8,3,6,_],
              [5,6,_,_,3,1,9,_,8],
              [_,5,6,_,9,3,8,1,_],
              [8,_,1,6,_,5,_,9,3],
              [9,3,_,8,1,_,5,_,6]]).
In this case, the solution is not unique. We can use the constraint solver to count the number of admissible solutions, again using search:

    ?- knuth(d, Rows),
       sudoku(Rows),
       findall(., maplist(labeling([ff]), Rows), Ls),
       length(Ls, L).
Yielding: L = 12.

This indeed shows that search is necessary in general also when applying constraint programming. Still, no such search is applied in the posted article. In the example contained in the article, the constraints alone suffice to determine the unique solution. I leave it as an exercise for the reader to determine whether this is the case for all 4x4 Sudoku puzzles.

Thanks for posting. Wasn't sure if folks were interested, but I've found Knuth's fun with Sudoku to be a lot of fun.

I'm curious on how you claim a strong enough solver could have solved this without a "guess."

Also, any thoughts on how treating them as constraint propagation compares to casting it as an exact cover problem?

Regarding the strength of different solvers: The solution I posted is just one out of many possible constraint-based formulations of Sudoku, using one particular domain and one particular solver implementation. In this concrete case, I have used constraint logic programming over integers, also known as CLP(FD), which is available in many Prolog systems. This is one of the most important and most prominent applications of constraint logic programming and even of logic programming in general.

As an alternative, we can just as well model Sudoku puzzles as combinatorial tasks over Boolean variables, and hence use CLP(B), constraint logic programming over Booleans.

For example:

    sudoku(Rows) :-
            length(Rows, 9), maplist(same_length(Rows), Rows),
            maplist(row_booleans, Rows, BRows),
            maplist(booleans_distinct, BRows),
            transpose(BRows, BColumns),
            maplist(booleans_distinct, BColumns),
            BRows = [As,Bs,Cs,Ds,Es,Fs,Gs,Hs,Is],
            blocks(As, Bs, Cs), blocks(Ds, Es, Fs), blocks(Gs, Hs, Is).

    blocks([], [], []).
    blocks([N1,N2,N3|Ns1], [N4,N5,N6|Ns2], [N7,N8,N9|Ns3]) :-
            booleans_distinct([N1,N2,N3,N4,N5,N6,N7,N8,N9]),
            blocks(Ns1, Ns2, Ns3).

    booleans_distinct(Bs) :-
            transpose(Bs, Ts),
            maplist(card1, Ts).

    card1(Bs) :- sat(card([1],Bs)).

    row_booleans(Row, Bs) :-
            same_length(Row, Bs),
            maplist(cell_boolean, Row, Bs).

    cell_boolean(Num, Bs) :-
            length(Bs, 9),
            sat(card([1],Bs)),
            element(Num, Bs, 1).
In this formulation, I use 9 Boolean variables to represent which of the 9 possible integers is assigned to a particular cell.

Let us apply this formulation to puzzle (b) that is shown in the solution of this exercise:

    knuth(b, [[1,_,3,_,5,6,_,8,9],
              [5,9,7,3,8,_,6,1,_],
              [6,8,_,1,_,9,3,_,5],
              [9,5,6,_,3,1,8,_,7],
              [_,3,1,5,_,8,9,6,_],
              [2,_,8,9,6,_,1,5,3],
              [8,_,9,6,_,5,_,3,1],
              [_,6,5,_,1,3,2,9,8],
              [3,1,_,8,9,_,5,_,6]]).
Here is the result:

    ?- knuth(b, Rows),
       sudoku(Rows),
       maplist(portray_clause, Rows).
    [1, 2, 3, 4, 5, 6, 7, 8, 9].
    [5, 9, 7, 3, 8, 2, 6, 1, 4].
    [6, 8, 4, 1, 7, 9, 3, 2, 5].
    [9, 5, 6, 2, 3, 1, 8, 4, 7].
    [7, 3, 1, 5, 4, 8, 9, 6, 2].
    [2, 4, 8, 9, 6, 7, 1, 5, 3].
    [8, 7, 9, 6, 2, 5, 4, 3, 1].
    [4, 6, 5, 7, 1, 3, 2, 9, 8].
    [3, 1, 2, 8, 9, 4, 5, 7, 6].
For comparison, here is the result of the CLP(FD)-based formulation that I posted earlier:

    ?- knuth(b, Rows),
       sudoku(Rows),
       maplist(portray_clause, Rows).
    [1, _, 3, _, 5, 6, _, 8, 9].
    [5, 9, 7, 3, 8, _, 6, 1, _].
    [6, 8, _, 1, _, 9, 3, _, 5].
    [9, 5, 6, _, 3, 1, 8, _, 7].
    [_, 3, 1, 5, _, 8, 9, 6, _].
    [2, _, 8, 9, 6, _, 1, 5, 3].
    [8, _, 9, 6, _, 5, _, 3, 1].
    [_, 6, 5, _, 1, 3, 2, 9, 8].
    [3, 1, _, 8, 9, _, 5, _, 6].
This shows that CLP(B), in contrast to CLP(FD), has determined the unique solution without any search. This is an example of a solver that propagates as strongly as possible, and hence achieves global consistency even across different constraints.

It also works for puzzle (a):

    ?- knuth(a, Rows),
       sudoku(Rows),
       maplist(portray_clause, Rows).
    [1, 2, 3, 4, 5, 6, 7, 8, 9].
    [5, 9, 8, 7, 2, 1, 4, 3, 6].
    [7, 6, 4, 3, 8, 9, 2, 5, 1].
    [8, 4, 2, 1, 9, 3, 6, 7, 5].
    [9, 7, 5, 2, 6, 4, 8, 1, 3].
    [6, 3, 1, 8, 7, 5, 9, 4, 2].
    [3, 1, 6, 9, 4, 8, 5, 2, 7].
    [4, 5, 7, 6, 3, 2, 1, 9, 8].
    [2, 8, 9, 5, 1, 7, 3, 6, 4].
At this point, you may wonder: Why have I started with (b) instead of (a)? The reason is easy to explain: For puzzle (a), the strong propagation of the CLP(B)-based formulation took so long that it only completed now, after I have finished typing this, a few minutes after I had posted the query, whereas it only took a few seconds for example (b).

Both examples have taken considerably longer to solve with CLP(B) instead of CLP(FD). That's not to say that one approach is better than the other in general or even in that particular case - instead, it is more an observation about the solver implementations and the chosen trade-offs between propagation strength and efficiency in both solvers.

CLP(B) achieves this strong consistency by internally considering, in a sense, the totality of all possible assignments, and this may require exponential space and hence also time. But for many important practical cases, the representation used by CLP(B) is very efficient, and indeed allows us to solve tough problems efficiently. Note that still, no search is involved here: Even the internal propagation of CLP(B) proceeds completely deterministically, and at no point in the computation do we have to guess anything.

It must also be said that the CLP(FD)-based formulation, in addition to being faster (despite the necessary search), is also shorter and arguably more natural than the Boolean variant. Casting this as an exact cover or hitting set task can definitely be done, either with CLP(B) or CLP(FD) constraints (since Booleans can also be considered as a special case of integers). However, it takes us even further away from the quite direct and readable "natural" CLP(FD) formulation. For cover problems in general, I recommend you obtain a CLP(B) solver that internally uses Zero Suppressed Binary Decision Diagrams (ZDDs). They let you efficiently represent cases where many of the variables (such as row selection indicators) are zero, which is typical for covering tasks.

Consider the Eight Queens problem. There are multiple valid solutions to the problem on a given chessboard. They are symmetric solutions, yet distinct. Any algorithm that finds a single solution and not all of them must be prioritizing arbitrarily among solutions, i.e., searching or fathoming them in some particular order.

> This solution is found via constraint propagation alone. This means that the system has deduced the unique solution purely by reasoning about the available constraints, which it does automatically every time a constraint is posted. This is one example of an algorithm that does not guess.

Have you considered how that constraint propagation algorithm works? What specific algorithm are you referring to? Constraint Programming is an alternative formulation of the same problem that Integer Linear Programming tackles, and the latter is known to be NP-hard.

https://en.wikipedia.org/wiki/Linear_programming#Integer_unk...

Constraint Programming and Integer Programming are both characterized as examples of Tree Search:

http://preserve.lehigh.edu/cgi/viewcontent.cgi?article=1780&...

> Since tree search is a basic solution technique employed in both constraint and integer programming, we begin with a generic overview oftree search as a technique for finding feasible solutions to mathematical models. [...]

> Every tree search algorithm is defined by four elements: a node- processing procedure, pruning rules, branching rules, and search strategy. The processing step is applied at every node of the search tree beginning with the root node, and usually involves some attempt to further reduce the feasible set associated with the node by applying logical rules.

> Overall a tree search algorithm works as follows. A list of nodes that are candidates to be processed is maintained throughout the algorithm. This list initially contains only the root node. At each step in the algorithm, a node is chosen and processed. If the processing results in pruning of the node by one of the pruning rules, then the node is discarded. Otherwise, the node is further divided, resulting in two or more children, which are then added to the list of candidates. This procedure is iterated until no nodes remain on the candidate list.

> [...] the constraint program is made up of two pieces: the constraints that comprise the formulation and a search strategy for solving the problem (emphasis mine). This is in contrast to a mathematical program, which describes a model but does not specify how to solve it.

> Traditionally, a constraint programming formulation consists only of a list of decision variables, domains, a set of constraints to be satisfied, and a search procedure (emphasis mine). This traditional definition refers to a constraint satisfaction problem (CSP), since a solution is any assignment of values to variables such that all constraints are satisfied.

Can you clarify which algorithm you're referring to, and how it manages to solve the problem without employing Tree Search?

One of the most important and characteristic aspects of constraint programming is alluded to in the following part of your quote (emphasis mine):

"involves some attempt to further reduce the feasible set associated with the node by applying logical rules"

These propagation rules are among the major attractions of CP, and distinguish the paradigm from uninformed search strategies that have to consider much larger portions of the search space in general. I think a description of CP should put at least equal emphasis on this pruning mechanism as on the actual search itself, because sufficiently strong propagation may render the search completely unnecessary. In fact, this is exactly what happens in the Sudoku example of the article!

In a sibling thread, sfrank has cited a very important reference as an example of such a propagation algorithm, which serves to illustrate this idea and is also widely used in practice in CP systems and their applications:

A filtering algorithm for constraints of difference in CSPs, J-C. Régin, 1994

To see this algorithm in action, consider the following example: Take three variables X, Y and Z, all of them integers that are either 0 or 1, with the additional constraint that they must be pairwise distinct. In Prolog with CLP, and said algorithm available under the name all_distinct/1, we can state this task as follows:

    ?- X in 0..1, Y in 0..1, Z in 0..1,
       all_distinct([X,Y,Z]).
In response to this query, the system says:

    false.
This means the system has deduced that there are no solutions. Note that this result is obtained without applying any search! For comparison, suppose we naively post pairwise disequalities instead:

    ?- X in 0..1, Y in 0..1, Z in 0..1,
       X #\= Y, Y #\= Z, X #\= Z.
In response, the system now answers:

    X in 0..1,
    X#\=Z,
    X#\=Y,
    Z in 0..1,
    Y#\=Z,
    Y in 0..1.
This means that there could potentially be solutions. The system does not know whether there are any, so it shows us remaining constraints that must hold for any solution. Now, an additional search makes clear that there are none:

    ?- X in 0..1, Y in 0..1, Z in 0..1,
       X #\= Y, Y #\= Z, X #\= Z,
       label([X,Y,Z]).
Here, I have simply added the goal label/1, which is the explicit enumeration and thus the search I have mentioned in an earlier post. In response, we now again of course get:

    false.
As another example of propagation, please consider:

    ?- X in 0..1, Y in 0..1, Z in 0..2,
       all_distinct([X,Y,Z]).
    Z = 2,
    X in 0..1,
    all_distinct([X, Y, 2]),
    Y in 0..1.
Here, the system has deduced that Z is necessarily 2, again without any search. For the other variables, both values are still admissible, and to obtain concrete solutions, we must again label them explicitly. However, we do know that there is a solution in this case, due to the strength of all_distinct/1, which internally uses a complex reasoning about the value graph of the CSP that is somewhat anticlimactically encapsulated in such a superficially trivial predicate, but propagates much more strongly than simply stating pairwise disequalities would.

This shows that you can arrive at a solution (or prove the lack of a solution) either via a search or by using a sufficiently strong propagation mechanism. Doing more of one requires less work of the other. Please note that, as I said before, you indeed need an explicit search in general, because the propagation algorithms alone are, while often quite effective, not effective enough in general to guarantee consistency when different constraints interact, even if they guarantee domain consistency individually. This is a practical trade-off between propagation strength and efficiency of the available constraints. You also need search to enumerate all solutions, if there are more than one. However, note that propagators are also triggered during the search, and so unifying even a single remaining variable with a concrete integer may again lead to a situation where no additional search is necessary. Thus, the initial size of the search space is not a satisfactory measure for the eventual complexity of the search, because it does not account for the additional propagation that is applied during the search itself.

Even in the case of Sudoku, and even if you use the most powerful individual all_distinct/1 constraints, you need to search, in general, to truly generate the unique solution explicitly. But the Prolog code shown in the article doesn't (hence, it will lead to floundering constraints in general), and the concrete puzzle shown in the article is, coincidentally or not, correctly solved just by such propagation algorithms, without any search. This situation is also not particularly rare: In many cases of practical importance, constraint propagation alone already tightens remaining domains so significantly that no or only very little additional search is necessary.

Thus, everything you said is true: Yes, you need search in general, also when using CP, to obtain concrete solutions. However, to repeat, it is quite different from what one would call "brute force" search because the sophisticated and often very effective propagation algorithms prune the search space, often substantially, and can moreover help to guide the search with various heuristics. In practice, you typically buy a Prolog system just to benefit from these propagation algorithms! Various notions of consistency exist, and you can find more such algorithms in the literature references that are for example included in SICStus Prolog and other Prolog systems with constraints.

Second, note that the description you quote does not even cover the particular case that arises in the article and also many other cases in practice: That is, there are no nodes at all, because everything is already settled by constraint propagation before any search even starts, making the whole solution explicitly available without search.

The Eight Queens problem you cite is quite different from Sudoku, because it may have multiple solutions. For this reason, as you correctly observe, search is needed to generate the different solutions in such cases. But this does not invalidate the Sudoku example, where we know that exactly one solution exists, and sufficiently strong pruning could always determine it, whether by internal propagation or other algorithms that simply eliminate all values that do not participate in the unique solution of the puzzle. Nor does it show that the cost for the search outweights that of constraint propagation in either case.

For these reasons, I recommend to put at least equal focus on constraint propagation as on the search when discussing CP.