Hacker News new | ask | show | jobs
by cwzwarich 1401 days ago
Facebook Infer is now just an ordinary unsound static analysis tool where some of the original analyses are inspired by separation logic. It doesn't actually prove any facts about the program being analyzed using separation logic.
2 comments

It's a common trend in industry. Here is this brilliant logical tool for reasoning about programs, in this case Hoare triples' descendent separation logic. Now let's figure out how we can keep our programmers from having to learn how to use it!

A logical description of what a piece of code does is evidently too much to ask of a working programmer. Instead, the prevailing dream is to just write code and then have some other code figure out what the first code actually does. Surely the sufficiently smart compiler is almost at hand.

This used to frustrate me and I suppose to some minor extent it still does. However, most of my frustration was relieved by Dijkstra's distinction between program correctness and program pleasantness. With vanishingly rare exceptions, revealed preferences show that industry could not care less about correctness. After all, it's pleasantness that determines success in the marketplace. Plenty of critically incorrect code nevertheless makes its owners billions. Sure, maybe all your private medical data and credit history got leaked, but here have a voucher for $10 a month worth of dark web monitoring for a year. I don't like this model, but at least it's in some sense rational.

I think it's that most, or nearly every code, is actually not critical and failure is ok. I would argue that code for every part of facebook is actually not critical. Of course it's bad if there's an error or the feed of instagram is actually not correct, or you can't send a message right now. But it's not critical, facebook will survive and the cost of eliminating those bugs is just not worth it. After all, unit tests only reduce the possibility of failure to an acceptable level. Amazons cloud had outages and amazon survived. It just about economics. The cost of hiring the right developers, writing and maintaining the code etc.

Writing code is not about solving a problem perfectly but delivering solutions under economic constraints.

> Here is this brilliant logical tool for reasoning about programs, in this case Hoare triples' descendent separation logic. Now let's figure out how we can keep our programmers from having to learn how to use it!

That's a good description of the Rust borrow checker, which is intended to be a sound analysis but a lot more restrictive than full separation logic.

Yes I think we need less abstraction. Best illustrated with RPC, when we try to treat it the same as local call we lose so much information. We want to reason about latency and failure so we need our language to represent this information and allow us to apply our logic to it.
It, like so many things, is a balancing act. How much does the working programmer need to know about binaries, symbol tables, platform calling conventions etc etc to use a debugger well? The author of a debugger absolutely has to, the user less so except in specialized circumstances. Allowing users to get at the inside-baseball bits of a tool without disenfranchising people that are focused on other areas of work seems ideal to me. RPC is unreasonable in that it makes unfair assumptions about the nature of the network. Seems to me insisting that programmers learn a good chunk of separation logic before knowing if a tool that uses the same internally is unreasonable in a like way. But a tool that _allows_ users that get some utility from it to learn separation logic if they need, now that's a well-designed tool.

If you can't use a tool without being an expert in it, the only people that will use that tool are experts and the techniques die out as the experts do.

I agree. Experience has taught me to hesitate to speak in absolutes absent a hard proof, but until I see a counter-example I'll believe that all abstractions are more or less leaky. Even pure functional programs have the observable side effect of using CPU time.

It's a difficult problem that I believe has no globally optimal solution. However, I think there's fruitful research to be done in better recognizing what ought to be abstracted and what ought not to be in some domain specific sort of way. Informally we do this already. For example it's relatively common knowledge that one shouldn't leak username validity via timing attacks, so in that case the processing time cannot be abstracted away.

1. This "unsoundness" was the subject of extensive (endless) discussions on the Infer team (as you can imagine). This spurred some theory and there are some precise theorems associated with how parts of Infer work. E.g., racerd has some theorems and over a period of a year saw no observed false negatives, so I'm not sure it's accurate to say that it's "just an ordinary unsound static analysis" (see https://dl.acm.org/doi/10.1145/3290370 and references).

2. Apart from that, to me, if logic inspires (part of) the design of an analyzer, that's perhaps more valuable than using logic to judge it after the fact, as analysis design is hard; and this is true even if analyzer ends up following part of the spirit if not the letter of the logic.

I believe I get where you care coming from, though, and appreciate your perspective. :)

Oh wow, this thread has attracted some notable participants. Much respect for your work. While you're here I've been wondering about applying all this stuff to the problem of having an unknown computing environment and bootstrapping a sane workspace on top of it. That might not make sense. I'm at a loss for how to explore the literature. Something like ubiquitous/pervasive computing with service discovery combined with foundational proof carrying code, certified compilers, verified stacks, etc.