Hacker News new | ask | show | jobs
Solving SAT via Positive Supercompilation (hirrolot.github.io)
153 points by Hirrolot 866 days ago
5 comments

There are many very simple heuristics for solving 3-SAT, some even with analytical convergence estimates. A simple random walk (Algorithm P in Knuth's Satisfiability fascicle), which iterates through the clauses and flips a variable at random if it encounters a violated clause, will do the trick too
Conversion of an NFA graph to DFA (in regex processing) seems to be a kind of supercompilation. The NFA is treated as a process, and is simulated for all possible input symbols at every step. The supercompiler makes note of what states it goes into and turns sets of them into DFA states.
Without backreferences, it has a worst-case exponential blow up due to the DFA possibly needing to simulate that many simultaneous NFA states.

Regex matching with backreferences is NP-hard.

- https://perl.plover.com/NPC/NPC-3SAT.html

- https://perl.plover.com/NPC/NPC-3COL.html

Ah, should've double checked before posting. Thank you everyone who corrected me. Leaving it up for others.

> SAT is an NP-complete problem, meaning that it is at least as hard as any other problem in NP

I think this statement is backwards. Instead, any other problem in NP is at least as hard as SAT

NP-complete means that a problem is both NP-hard (it's as hard as anything else in NP) and is in NP (with all that membership in NP implies: there is a polynomial time witness and verifier for a "yes" answer, it's at most polynomially harder than any NP hard problem, etc.).
Nope. For example, binary search is in NP (because it is in P), but is much easier than SAT.
To be slightly nitpicky, it is problems that belong to complexity classes and not algorithms. Binary search is an algorithm that solves searching in a sorted list.

Funnily, although we usually think of it as having complexity O(log n), that does not hold true for the Turing machine model, with which the complexity classes P and NP are defined.

Join the rest of us in post-1950's complexity theory. We can use register machines now. Or you can bust out the big bucks and buy a RASP.
Wait:

` binary search is in NP (because it is in P),`

P = NP!? Casually proved in a HN comment?

It's well known that P is a subset of NP.
If you're a fellow dropout like me, context:

"P" is the class of problems that can be solved by a deterministic Turing machine in polynomial time. "NP" stands for "nondeterministic polynomial time," and is the class of problems for which a solution can be _verified_ in polynomial time by a deterministic Turing machine, given the correct "hint" or "certificate." Every problem in P is also in NP, since if a problem can be solved in polynomial time, its solution can certainly be verified in polynomial time (just solve the problem again).

> (just solve the problem again).

Certainly P is a subset of NP, but is this statement always true?

Consider a Galton board where a marble is dropped into an array of pegs. At each peg, it can go left or right at random (assume it never bounces further). After some number of choices, it falls into a slot.

Finding a solution (a slot that a marble can fall into) is easy: drop one marble. But it could take many marbles to verify that a marble can fall into a given slot.

As described, that is easy: all left choices, all right choices, everything in between. But imagine that the choices aren't fair, directly observable, or linear (LR != RL). Then it gets tricky.

I've never thought about non-deterministic problems of that sort in the context of computational theory, but it isn't uncommon in nature.

So we're already halfway done with the proof.
> So we're already halfway done with the proof.

Indeed the Cook-Levin theorem

> https://en.wikipedia.org/wiki/Cook%E2%80%93Levin_theorem

which actually proves that there exist NP-complete problems in NP is not a trivial result (and was at that time a highly surprising one).

So, indeed, "one half of the proof" has been done; the other half (which seems to be much harder) of P vs NP is still open.

The parent is implying the inverse, that NP is contained within P.
I believe you're misreading the comment? It says "binary search is in NP (because it is in P)"... i.e. the fact that binary search is in P implies that it is in NP. Which is true because P is a subset of NP.
Thinking of this way helps me.

NP-complete is the intersection of sets NP and NP-hard , but not the part of NP that contains P.

Lots of NP-hard problems are not in NP is important.

I'm almost completely clueless with both 3-SAT and supercompilation, but I'm willing to bet that if you found a neat trick that solves 3-SAT problems easily, then either or both (a) there are classes and/or sizes of 3-SAT problems for which your solution breaks down badly (goes exponential), or (b) what you're doing reduces to a 3-SAT strategy that the people who enter 3-SAT contests[1] already knew about a long time ago.

[1] http://www.satcompetition.org/

The solution has exponential time and space complexity, which I noted in the blog post. It is only interesting for its simplicity and as a stress test for supercompilers.
Ah, I missed that. Makes perfect sense! By the way, if you're interested in the "this problem can blow up badly because it can be converted to 3-SAT", you might like https://research.swtch.com/version-sat
3-SAT is NP-Complete, so either (a) is correct or else P=NP.
Touché! I guess I was thinking that there are often sub-classes of NP-complete problems that are amenable to much more efficient algorithms, and it might be easy to accidentally not try any hard ones. Turns out the author fully acknowledged exponential explosion, which I would have known had I read Final Words section instead of skimming everything after the intro!
3-SAT is at least as hard as SAT, or SAT ≤p 3-SAT. We know this because we can translate arbitrary SAT instances into 3-SAT in polynomial time. 3-SAT is in NP because we can check a candidate solution in polynomial time by plugging in the values and evaluating.

More accurate than “an NP-complete subset of SAT” would be to say that SAT is polynomial-time reducible to 3-SAT or that we can solve SAT in terms of 3-SAT.

Combining both of these, each is reducible to the other. In some sense, they are the same problem. 3-SAT imposes just enough structure to force it into NP-complete. 2-SAT is in P.

sad that SAT competition still hasn't moved from 00s and still is on http

SAT conferences aren't on youtube either :(

I love that the favicon for the 2023 results is a Doge, lol: https://satcompetition.github.io/2023/results.html
Haha, awesome!
Is this just compiler optimized memoization? I mean this is great for parallelization as well but I thought this was standard in the functional programming paradigm. Maybe I am wrong.
As I recall, memorization usually involves associating the results of calling a pure function with the particular input values so the computation of the function is reduced to a lookup operation.

The calculation of the pure function is still used for any other combination of input values.

Supercompilation looks at the unevaluated symbolic recursive code and replaces the code in some cases with simpler, specialized code tuned to the actual input values, so does a memoizing not only of the result, but the code used to calculate the function to begin with.

There are currently no production-grade compilers that make use of supercompilation in the functional world. Memoization is about caching results of pure functions; supercompilation is about deep program transformation.
Any idea why there aren't? Not even to a small degree? To me, it looks like fairly straight-forward rewriting. Perhaps it's hard to tell how much is useful?

> deep program transformation

It doesn't have to be deep, does it? You can't transform the entire program, so there's an arbitrary cut-off anyway.

I think there are a few big problems. Performance is a big one, code bloat is another. You also need a very good termination heurestic or there will be edge cases where compilation time spins out of control.

Here is some discussion here on why a pretty serious for GHC stalled: https://www.reddit.com/r/haskell/comments/2s97d0/the_state_a...

Supercompilation acts as global program optimization. As far as I'm aware, there were no attempts to supercompile separate libraries.

In fact, supercompilation subsumes many common optimizations, such as function inlining, constant propagation, list fusion, etc. So we can say that it's actually used "to a small degree", although I wouldn't really call it "supercompilation" per se.

The reason that full-blown supercompilation is not used is that it's too unpredictable -- the direct consequence of being too smart. I've elaborated on it in the final section of the blog post.

I thought the SPARC part of Ada did that