Hacker News new | ask | show | jobs
by EFruit 1562 days ago
I wish the underlying theory to this whole class of software was more accessible. It seems like these logical systems can do such amazing things, but I'd be remiss to integrate any of them without some degree of understanding of what they're doing under the hood; what their limitations and failure modes are, etc. As it stands, it's all far too magical for me to trust.
6 comments

The basic algorithm is DPLL and the other stuff is mostly clever hacks added on. There is a critical-systems analogy for SAT that supposedly really does go pretty far. SAT problems are like water: below a certain temperature, water is frozen and understandable. By analogy, SAT problems with too many clauses per variable are generally unsatisfiable and it's not too hard to figure that out. Above that temperature, water is liquid and understandable: the corresponding SAT problems tend to be satisfiable. It's only right around the critical temperature that stuff is chaotic and unpredictable, and that's where the hard SAT instances are. When I say that the analogy goes pretty far, I mean that the phenomenon really does get studied using tools of statistical physics. But I don't know anything more about that at the moment.

https://yurichev.com/writings/SAT_SMT_by_example.pdf is a very good tutorial about using SAT solvers, though it doesn't say much about the theory.

The so-far-released portions of Knuth TAOCP vol 4 do discuss SAT solver theory.

What you just stated about the critical temperature is true for some classes of random instances. Early results on larger classes of random instances from this line of research do not survive scaling experiments. I don't have a reference one that but I recall that being said in one of the discussion rounds of the Beyond Satisfiability online seminary at the Simons Institute.
It's unfortunately a very rapidly evolving field and is difficult to "break into" as far as understanding internals. Though just understanding DLPP is a good start, but ultimately you need a lot of practice and experimenting on your own to get a good intuition for it.

For the record these are problems many "solver-aided" communities struggle with; I'd say the #1 problem that people have with this class of tools is prospective, interested users struggle with the issue of "how can I encode my problem into a SAT instance with reasonable efficiency." I know this because it's my #1 problem!

On the other hand, there are a lot of "straightforward" NP problems you can run into and get good solutions for with little effort though. Binpacking comes up surprisingly often and being able to just throw it at a solver and get a good, generic solution quickly is always nice (not that this is exclusive to SMT solvers, but it's an easy way to get your feet wet at least.)

Decision Procedures [1] is a very good (and quite accessible) overview of the algorithms used in modern sat/smt solvers.

[1]: https://www.decision-procedures.org/

It is indeed magical with largely unpredictable run times, especially once you venture into undecidable fragments (quantified formulas). There is a million heuristics that you can tune (and indeed there have been attempts of using ML for that; ML is mostly useless for the actual reasoning). Using quantifiers with SMT is a bit like programming in Prolog though much more non-deterministic.

However, for the simpler (just NP-complete) problems the run times are more predictable. If your problem has a streight-forward encoding into SMT, you should definitely try it. The common format also makes it relatively easy to swap tools like CVC and Z3.

Quantified SAT is decidable. Are you aware of any theories where expressions with quantors over finitely many choices are unsatisfiable but quantor free expressions are satisfiable over the same theory?
It'd be great if it were less opaque, yes. SMT is a young field (early oughts) with a lot of extremely complicated algorithms jamed together, and there's not a lot of explanations beyond scientific papers. Even then, lots of topic required to implement a solver are not published at all, you need to know experts and ask them directly.

The best way is of course to write a solver, but it's a tremendous amount of work to obtain a system that will most likely have bad performance, and where a single bug can invalidate the correctness of the whole program. Ask me how I know :-). People are working towards proof checking though, so that's at least going to help with trustability.

Surely you are not using neural networks ! They are quite magical too !