Hacker News new | ask | show | jobs
by Hercuros 2500 days ago
I agree that verifying programs using dependent types is impractical for large programs. I think that the reasons are mostly practical rather than theoretical, however. There's too little benefit for too great a cost if you want essentially 100% correctness guarantees for every single part of a large, existing program.

Humans regularly do "outperform Turing machines." Most of the complicated mathematical proofs that have been produced by humans, cannot currently be automatically produced using an algorithm in reasonable time. Pick almost any non-trivial mathematical theorem, and you would be hard pressed to write a program that could prove it in a reasonable amount of time given just a list of the axioms. In theory, humans might not be able to outperform TMs, but in practice they regularly do, as attested to by the large amounts of mathematical proofs that we have produced by hand.

Model checkers also outperform humans for some tasks, but then again this is because human intuition has been used to write the model checker in the first place. My point is that using theoretical computer science results to talk about the difficulty of finding mathematical proofs in real-world cases is a bit of a non-sequitur. Yes, the Linux kernel would be practically impossible to verify completely using semi-automated tools, because it would take too much time and effort. But is this because of some theoretical result about the model checking problem? I don't think so.

One difference between having a model checker call a proof assistant and using a proof assistant to implement a verified model checker is that if you write a verified model checker implementation, you can be almost 100% certain that it's bug-free, because it creates proof terms that can be independently checked by the (ideally small and simple) proof kernel of the proof assistant. You don't have to trust that the model checker has been implemented correctly. This difference does not matter that much in practice, however, since model checker implementations probably receive a lot of testing and are quite trustworthy.

You're right that I am mostly talking about Coq, and that a different style tends to be used in Haskell/Idris. Note that dependent Haskell is actually logically inconsistent for backwards-compatibility reasons anyway, so there's not much to lose there by cutting a few corners for pragmatic reasons. It is not necessary to verify every single part of the program. Dependent types in Haskell gives you the freedom to do more. You can choose not to use that freedom where it doesn't make sense.

1 comments

> Most of the complicated mathematical proofs that have been produced by humans, cannot currently be automatically produced using an algorithm in reasonable time.

That's true, but humans have not done that in a time frame that's even remotely reasonable for verifying programs. Automatic verifcation methods outperform humans by orders of magnitude, on average, as far as program verification is concerned. My point is that if it is true that real-world programs are far from the worst case -- and note that we do not currently have much evidence to support that belief, because we have not been able to regualrly and affordably verify programs with deductive proofs -- then automated tools will be able to exploit that as well (once we know what "that" is, assuming "that" exists at all).

> My point is that using theoretical computer science results to talk about the difficulty of finding mathematical proofs in real-world cases is a bit of a non-sequitur.

Why? That's the entire point of the entire field of computation complexity. But don't think I'm pessimistic. I just think software verification is an extremely complex subject, and anyone who says they can certainly tell now which approach will "win" simply does not know what they're talking about.

As Fred Brooks wrote, realizing the limitations and understanding the magnitude of the challenge is not pessimistic. Physicists don't feel that the speed of light is a non-sequitur. Quite the opposite: understanding the limitations and challenges allows us to focus our efforts where they'll do the most good, rather than bang our heads against the wall just because that's what we happen to know how to do.

> Dependent types in Haskell gives you the freedom to do more. You can choose not to use that freedom where it doesn't make sense.

Given that only 0.01% of programmers use Haskell in production I don't really care what approach Haskell uses. But as someone who takes verification seriously, I think contract systems give you the freedom to use deductive proofs -- just as dependent types do -- as well as other methods, which currently have proven much more practical and useful in the real world.

What I meant by non-sequitur is that I agree with your premises (that certain problems are computationally intractable or undecidable, and this is good to know), and I agree with your conclusion (full verification using deductive methods/dependent types is unfeasible for large software projects). However, I do not agree that the conclusion is true because of the premises.

That is, I don't think one can say that the reason why human-aided verification is unfeasible for some projects (say, the entire Linux kernel) is because of theoretical limitations. Yes, verifying the Linux kernel would take too much time, but not because of any specific theorem in theoretical computer science.

The reason why I am emphasizing this point is that I believe that referring to theoretical results in discussions like this makes it seems as though there is some kind of fundamental limitation that prevents people from verifying the Linux kernel in a proof assistant. Even though it is quite a jump to conclude that from theoretical results, especially when human intuition is involved.

I would say the reason that people haven't verified a program the size of the Linux kernel is that it's just too much work. Just like if I were to try to write the Linux kernel from scratch on my own, it would take me years and years. But not because of a mathematical theorem. Some things are just a lot of work without there being a deep reason behind it.

We can go much further with deductive verification combined with automated verification compared to automated verification alone, precisely because human intuition can be used to guide the automated tools.

EDIT: To give another example, I would have no problem with the following reasoning:

A SAT solver that is efficient for arbitrary SAT problems cannot be implemented, because SAT is NP-complete (assuming P =/= NP).

However, I would have a problem with the following line of reasoning:

Intel CPUs cannot be fully verified by people because SAT is NP-complete.

> That is, I don't think one can say that the reason why human-aided verification is unfeasible for some projects

That's not exactly what I said, or at least not what I think. I think that full, 100% confidence verification (the method is irrelevant) has some serious challenges as suggested by those theorems, and that it is, therefore, unjustified to believe that there would be a clear solution any time soon. One way to circumvent those challenges is, for example, to reduce the level of confidence.

> Yes, verifying the Linux kernel would take too much time, but not because of any specific theorem in theoretical computer science.

I'm not saying it definitely is, but I don't think anyone can say it isn't, either (well, technically, a theorem won't likely apply to a specific piece of code, except as a sample from some distribution).

> The reason why I am emphasizing this point is that I believe that referring to theoretical results in discussions like this makes it seems as though there is some kind of fundamental limitation that prevents people from verifying the Linux kernel in a proof assistant.

I think that the fundamental limitation (about the general case) suggests great challenges. And again, the specific method doesn't matter as long as it's sound. A proof assistant and a model checker are both bound by those results, as both are sound.

> Even though it is quite a jump to conclude that from theoretical results, especially when human intuition is involved.

I'm not sure human intuition is very relevant. We've had human intuition and proof assistants for the past 40 years, and still we haven't been able to prove software beyond a very small size. So I think that at the very least we should consider the possibility that if we can jump two feet up, then even though it seems like jumping all the way to the moon is a matter of degree, there just could be some fundamental obstacles in the way. I'm not saying we can't overcome them, but I'm saying we can't assume they're not there.

> But not because of a mathematical theorem. Some things are just a lot of work without there being a deep reason behind it.

OK, but people have written the Linux kernel, and that's a lot of work, too. Don't you think that if a mathematical theorem says that, in the worst case, A is much harder than B, and that in practice, A is, indeed, much harder than B, then maybe theory has something to do with it?

> We can go much further with deductive verification combined with automated verification compared to automated verification alone, precisely because human intuition can be used to guide the automated tools.

I think that's speculation and that it's unjustified by either theory or practice. It's unjustified by theory because if human intuition can consistently prove successful (to the point of being an industry technique), then whatever method is used suggests some recognized easier-than-worst-case subset, and so automated tools could take advantage of that, too.

It's unjustified by practice because you could have said the same thing thirty years ago, and here we are now, and automated tools are doing better on average (it's kind of complicated to compare, because there are cases where either one is much better than the other, but overall, when we look at "verification throughput", automated methods verify more software more cheaply). It is true that automated tools have some limitations that we haven't been able to break through, but so do manual proofs.

You could indeed make the case that the intractability/undecidability-type theorems suggest that the problem of program verification is inherently difficult, and not something that can be solved in most cases using a trick. It requires some deep thinking in many cases, the same kind of thinking required to find complex mathematical proofs.

Currently humans are much better at proofs that require this kind of deep thinking. Automated tools are better for proofs that are based on a simple structure, like a large enumeration of cases (as in SAT solving and model checking, oversimplifying things a bit). This is the very reason they are automated: the reasoning performed by them is so simple/well-structured that we are able to write a program to do it for us.

However, many proofs fall outside of this easy-to-automate subset, and require human assistance. Human intuition does consistently do better than automated tools, but there is no single easily recognizable "trick" or "method" to it, which makes it hard to implement in a tool.

In my view, the reason that theoretical results do not apply very directly to human reasoning is precisely that human reasoning is "intuitive" and hard to capture in a program. We are not really Turing machines in any practical sense. We are more adept at proving things (perhaps creating automated tools to help us along the way) than any known algorithm.

I absolutely agree that automated tools make much more sense for your average industry application, but if you want to be able to use the full power of human intuition, instead of restricting yourself to program properties that are easy to prove automatically, then there is no way around manual proofs.

> Human intuition does consistently do better than automated tools, but there is no single easily recognizable "trick" or "method" to it, which makes it hard to implement in a tool.

Except in software verification, the facts on the ground are that it is the automated tools that do better than humans. We're not talking hypotheticals or speculation here. If you've written software proofs you won't find this particularly surprising. They're tedious and frustrating, but rarely require much ingenuity. Attention to detail plays a far more important role [1]. They're not at all like proving mathematical theorems. Software proofs are to math proofs what painting a hospital is to painting the Sistine Chapel.

> In my view, the reason that theoretical results do not apply very directly to human reasoning is precisely that human reasoning is "intuitive" and hard to capture in a program.

Perhaps, but so far humans do not seem to outperform the expectations. So you're counting on a human ability that, to date, we've not seen. In fact, humans rarely if ever beat computational complexity expectations. SAT solvers do much better than humans at supposedly-NP-complete problems, for example, and so do various approximation algorithms. Why doesn't our intuition help us outperform algorithms there?

And again, if intuition were the decisive factor, how come we're doing such a terrible job? If algorithms beat us today, what kind of boost in intuition do you expect that will allow us to beat them tomorrow?

Look, formal proofs are great. They're clearly the gold standard. So I encourage everyone here who thinks proofs are the future of software correctness to try them out to write correct large and complex programs. The problem is that they'll fail, and either abandon formal methods altogether, or worse, waste their time proving what they can at the expense of verifying what they should.

> but if you want to be able to use the full power of human intuition, instead of restricting yourself to program properties that are easy to prove automatically, then there is no way around manual proofs.

Maybe. It's certainly possible that one day we'll figure out how to make manual proofs work reasonably well for software verification. It's also possible that by then automated tools will get even better, especially considering that, as they've shown greater promise over the past few decades, that's where most of the research is focused. Sure, there will always be some niches where tedious manual work is required, but I see no indication that that niche would grow beyond the size it occupies now, an important yet small part of formal verification.

[1]: https://brooker.co.za/blog/2014/03/08/model-checking.html