| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by martincmartin 3475 days ago
	This is the one paper I know of: http://research.microsoft.com/en-us/um/people/lamport/tla/am... I know of some small scale use at Facebook (where I work), where it has been used to verify a non-trivial protocol under various combinations of failures.

2 comments

technion 3474 days ago

I've been referred to this paper before when asking this question.

Whilst it does discuss that Amazon used TLA+ in nontrivial circumstances, it's still a long mental leap to me to get to "TLA+ for AWS S3 looked like this, and this is the sort of bug it helped quash".

It'd be great to see some of that sort of thing.

link

pron 3474 days ago

If you want to see what TLA+ specifications of distributed systems look like, there's this lecture series, each on a different distributed algorithm: https://github.com/tlaplus/DrTLAPlus

The kind of bugs you find when verifying distributed systems is, if this machine was in that state and that message was in transit, and then the other machine failed, then data was lost".

Basically, because formal verification of this kind ensures the specification is correct, it will find any kind of bug there is. But it's those bugs that are either a result of a severe flaw in the design, or bugs that would be hard to find in testing but their results may be serious, that make this worthwhile.

link

nonsens3 3475 days ago

Thanks. This is also the paper I was referring to.

link