Hacker News new | ask | show | jobs
by AnimalMuppet 818 days ago
First, did you (or anyone) write up the results from your measurement? That sounds like empirical data on a subject where I have never heard of their being data, so it would be really useful to capture it.

Second:

> The net effect was that we might have had 0 code bugs via this automatic proving system but the number of bugs in the specification actually went up.

Are you saying that this is part of what you measured? Or are you merely saying that this is hypothetically a way things could work out?

2 comments

>Are you saying that this is part of what you measured? Or are you merely saying that this is hypothetically a way things could work out?

I'm saying that, anecdotally, when I tried it I had more bugs thanks to formal verification because bugs crept into the spec. It was very hard to tell that those bugs were present because the spec was very, very complicated.

I'm not denying that it can catch bugs at all, just that a successful "proof" didn't mean a lack of bugs and that I personally found it to be a relatively inefficient method of catching bugs (or demonstrating their nonexistence).

>That sounds like empirical data on a subject where I have never heard of their being data

c.f.

https://userweb.cs.txstate.edu/~rp31/papers/KingHammondChapm...

Formal proof of correctness vs. manually created tests.

The comparison should be formal proof of correctness vs. fuzzing using the formal specification as a source of properties to be tested.

Fuzzing is a statistical technique that isn't ever going to give you a reassurance that a problem doesn't exist. It's great at giving you counterexamples, so fuzzing is great for discovering vulnerabilities, but unless you're fuzzing your program's entire state-space (which is absolutely impossible for even relatively small programs) then you're not comparing like with like.
>Fuzzing is a statistical technique that isn't ever going to give you a reassurance that a problem doesn't exist.

Formal verification doesn't prove that bugs don't exist either, thanks to the aforementioned "bugs in the spec" scenario.

So? The paper compared formal techniques vs. testing. Why is that suddenly not appropriate if the testing is fuzzing?