Hacker News new | ask | show | jobs
by jsnathan 3213 days ago
> Short-answer is yes, and there's been lots of work done to go even further than having static typing and have formal verification for smart-contracts.

Yes, I've looked at some of these projects before. And I certainly think it's a good idea to use automated tools to try to prove properties about programs.

But I'm more excited about things like property testing than I am about things like strong type systems. And I'm wondering specifically what is the value they bring.

No offense, but all of the answers I've gotten so far are very vague, and don't really address my question. I don't doubt that strong type systems can catch bugs, I am wondering how their capabilities in catching bugs differ from test suites.

Let me give you an example, say we have a hypothetical language with a strict type system, and we declare a variable to be of type List[Foo]. Then later we use that variable as if it was really of type Foo. That's not gonna work, and a type-checker would catch that at compile-time. But a test suite (that covers the variable access) is going to catch that as well, because the code won't behave as it should.

At which point is a strong type system going to surface a bug that a good test suite would not have? Like, can you give an example?

> Not sure what you mean by "reasonable" (is it extensive? testing pathological cases? where do you draw the line?).

The line is as variable as the strictness of the type system we would compare it to.

I guess one could argue that a type system will force the programmer to satisfy it, while a test suite can be written very sloppily. So maybe there is some kind of signalling value in using these types of languages.

3 comments

> I am wondering how their capabilities in catching bugs differ from test suites.

Well, for one they can prove the absence of certain classes of bugs. Buffer overflows, for example. No amount of testing can do that.

Obviously most languages have "escape hatches" to do inherently 'unsafe' things like calling into C, but then at least you know exactly which bits to audit especially rigorously.

> But a test suite (that covers the variable access) is going to catch that as well, because the code won't behave as it should.

How many different List[X], where X != Foo do you need to test with to have the assurance you need? Are those tests that will actually get written? (IME it's pretty rare to see such "negative" tests, but then I mostly work in typed languages where such tests are usually unnecessary...)

There's also the really huge advantage to types that they actually document a machine-checked contract in a way that integrates seamlessly with the language. There's no such consistency in e.g. JS-land. Now, those contracts may be pretty vague (in e.g. Java or C#), but in Haskell for example they include such things as "does this function have any side effects?". That's extremely powerful, but it's hard to appreciate just how powerful until you have experience in those type systems.

EDIT: Also, don't forget that tests also have costs -- they have to be maintained just like the rest of the program, and static types can drastically cut down on the amount of tests you need to write+maintain.

> I don't doubt that strong type systems can catch bugs, I am wondering how their capabilities in catching bugs differ from test suites.

Type systems and test suites are complementary. When a test suite finds a bug, that proves that the program is incorrect for some inputs. When a type system doesn't reject a program, that proves that the program is correct for all inputs.

Which one to use depends on the impact of an error. In the case of a system that controls lots of money, you'll want a guarantee that all inputs lead to a correct balance. That suggests to use a type system.

On the other hand, if you're just writing an app to get data from a website and display it, you can probably afford it if the program doesn't work in some cases. If you can write a generator for realistic input, and check the output, that will give you a probabilistic estimate of correctness.

The main advantage that test suites have over type systems is the kind of properties they can easily check. If you have a test suite that only checks that the output values have the right structure (EDIT: https://news.ycombinator.com/item?id=15137691 points out that part of the Lisk test suite does exactly that), you'd probably benefit even from the C type system. But to formalize the correctness of values, for many programs you'd need a much more powerful type system e.g. using dependent types, that isn't quite so simple to use as writing the equivalent test.

I think a good compromise would be a language that allows you to annotate your types with arbitrary properties, but doesn't complain if it can't type-check them, so long as you write a test. (But it should complain when it can prove that the properties never hold, e.g. using success types, so that you don't waste time writing a test for that.)

> When a test suite finds a bug, that proves that the program is incorrect for some inputs. When a type system doesn't reject a program, that proves that the program is correct for all inputs.

> Which one to use depends on the impact of an error. In the case of a system that controls lots of money, you'll want a guarantee that all inputs lead to a correct balance. That suggests to use a type system.

When a test suite passes that proves that the program is correct for a subset of all inputs. But I do not see how the same can be said about a type system.

Types do not usually capture semantics - unless we are talking about something a lot more powerful than what I'm used to seeing in real programming languages.

> I think a good compromise would be a language that allows you to annotate your types with arbitrary properties, but doesn't complain if it can't type-check them, so long as you write a test.

But why do I need types for this? Why can't I just assert the properties outright, writing assertions only in terms of the code interface? It's not the types that tell me what arbitrary properties my code should have!

> Types do not usually capture semantics - unless we are talking about something a lot more powerful than what I'm used to seeing in real programming languages.

You are right, I should have specified that the correctness proof only applies to properties actually given in the type system. So for a high-stakes financial application, your types should be strong enough to capture at least elementary arithmetic. This is definitely not something you'd see in a mainstream programming language.

> It's not the types that tell me what arbitrary properties my code should have!

When you give a variable in a program a type, you assert a property for all values that variable will ever take. The converse is also true: for any property you want to express, there is a type system that can encode it. This is called the Curry-Howard correspondence.

Unfortunately, most interesting properties one might want to formalize require either an undecidable type system, or you have to write a bunch of proof code just to convince the type checker that the rest of the code conserves the properties it should. That isn't too different if you'd be doing formal verification for all your code anyway, but it gets annoying when it is enforced everywhere, even when you'd deem it unnecessary otherwise. In that, it is similar to a policy of "unit tests everywhere", which probably catches some bugs, but also leads to lots of boilerplate stating the obvious.

> > Yes, I've looked at some of these projects before. And I certainly think it's a good idea to use automated tools to try to prove properties about programs.

> But I'm more excited about things like property testing than I am about things like strong type systems. And I'm wondering specifically what is the value they bring.

By "property checking" you mean theorem proving.

Static type checkers are theorem provers. More powerful type systems allow more interesting and less intuitive proofs to be written. Sometimes, the type checker is integral to the compiler and is required to run on every build, and sometimes it's an external tool.

In any case, if you're adding annotations to your source code in order for a theorem prover to statically prove certain runtime properties, those annotations constitute a static type system which is type-checked by your theorem prover.

Maybe you don't like even a subset of your theorem prover to run on every build, but if you're in favor of machine-checked proofs of program behavior, you're in favor of static typing.

Property testing [1] is about semantics, the same as regular testing, and it does not require code annotations. [Edit: It's more like "fuzzing" than "model checking".] Type systems on the other hand aren't usually powerful enough to capture semantics.

They allow you to say things like, this function takes in a list of Foo objects, and returns one Foo object. But they don't really let you express whether the object returned is or is not part of the original list, and if it is, that it was chosen according to the right mechanism. That's what tests are for.

Without being able to express the semantics of code, I don't see how you can trust it. There may not be any type mismatches, but there sure can be lots of bad logic in there.

[1]: http://hypothesis.works/articles/what-is-property-based-test...