Hacker News new | ask | show | jobs
by kthielen 697 days ago
I hear different opinions about whether type checking is "an" instance of abstract interpretation (AI) or "the" instance (which IMHO is a useful question since we might consolidate our efforts on one method if possible).

This article does a very good job of laying out typical examples motivating AI as a concept.

But the process of attributing points in the AI lattice to expressions does look a lot like type inference (and the AI lattice itself does look a lot like a type hierarchy).

Does this point to a slightly different approach we should take to type systems? For example, suppose the type of '1' was not 'int' but rather '1' (a type runtime equivalent to unit, but carrying its information at compile-time), and that 'add' is overloaded on input types so that it can either directly compute its result (also then statically knowable) or emit instructions to dynamically compute its result if its arguments are only dynamically known.

Is that the meta lesson here? Should we be using more detailed/nuanced type systems?

5 comments

If you're talking about type systems, you typically have one per language: a language has a certain type system. A different language might have a different one. If different implementations of the same language have different type systems, they tend to be called different dialects, if not different languages altogether.

But as noted in a sibling comment by 'chc4, one might want different AI domains for the same language, either all at the same time, or in different compiler passes, or only if the optimiser is enabled, etc. While AI might look like a type system, and perhaps the algorithms are sort-of similar, at the very least the two are used differently.

EDIT: There is prior art on doing type systems and further analyses simultaneously; these are sometimes called "type and effect systems": https://en.wikipedia.org/wiki/Effect_system Contrary to what wikipedia seems to claim, such systems need not actually talk about side-effects only; one can also formulate provenance analysis in a programming language (e.g. which lambda can end up in this function-typed variable?) as a type-and-effect system.

FWIW, the Patrick Cousot* perspective on abstract interpretation is that literally everything is an application of abstract interpretation. It's kind of a meme in the PL community at this point, but he's really all-in on it and even begins his course with a self-deprecating joke about this viewpoint.

*Patrick Cousot and his late wife, Radhia, are credited with inventing abstract interpretation.

my last job was working on [pytype](https://github.com/google/pytype), which uses abstract interpretation to do static type inference and checking for python. we used the cpython compiler to convert a program to bytecode, and then ran the bytecode through a VM where the abstract values were types. it worked extremely well, and could even typecheck completely unannotated code, though of course with less precision than when the user supplied some types.

extending that to runtime JIT compilation is an interesting idea; i'm not sure if any of the current JIT systems do that, but i don't see any reason it wouldn't be a useful technique, so they likely do.

What? There is no "the" instance of abstract interpretation, because different uses of it will have different abstract domains they are interpreting over and different transfer functions. There's no reason to choose types as being a more privileged implementant than odd/even, and choosing odd/even as being a "case" of a type doesn't make more sense than choosing types as being a "case" of odd/even.
In particular PyPy's abstract values have multiple domains at once: type, known bits, known range, etc. They do it all in one pass
no shade at you (i like you and your work and blog) but

> type, known bits, known range

these are all the same thing.

i was thinking about this question in the shower (before responding) and i'm pretty sure type inference (for whatever purpose, like deducing the type, or refining the type, etc) is the only use-case of abstract interpretation. if you look at couset's webpage (i haven't read the original papers) you see lots of cute pics of program trajectory but no one actually uses abstract interpretation for control-flow tracking - the lattice is only ever updated at a type boundary.

prove me wrong (like i could be wrong) but i'm reflecting on implementations in llvm/mlir/pytype ie mature/real use-cases (i have not poked around in pypy's impl).

They're related, but they're not the same thing: E.g. when analysing a JS-like language and you see "x = a + b" with raw memory content a = {0x1,0x2}, b = {0x3} you get x = {0x4,0x5,0x13,0x23} - because you can't tell if that + is a string concatenation or numerical addition.

Of course when analysing program code, you might want to carry type information with you. The math behind AI allows to combine basically any domain to a single domain without losing all the nice properties of the Galois connection. But usually you get away with tracking multiple domains simultaneously, since that's also a lot easier from an engineering perspective (someone has to understand and maintain that code for a few decades).

Edit: I recommend following funcDropShadow's advice.

I'll leave this link here for you: https://www.absint.com/astree/index.htm

If you are not convinced think about the name of the company during your next shower.

I'm not sure what you mean that they're the same thing. They're different domains with different transfer functions. There's even a recent ish paper exploring how they can be used to inform one another. Can you say more?
> type, known bits, known range

>I'm not sure what you mean that they're the same thing

they're all components of the type; i have no idea what pypy does with "known bits" but deducing "range" of a value is firmly/unanimously as "type refinement" (see https://en.wikipedia.org/wiki/Refinement_type) i.e. still a part/aspect/dimension of the type not of the value.

I think with a suitably expansive definition of “type” you can get pretty far in this direction.

I have personally used AI for control flow analysis of a toy OO language.

Let me repeat what I think the question is in other words:

Is there any kind of abstract interpretation that can not be modeled as a type system?

that's not what the other question/poster/person is implying - every can already be modeled as a type system because type systems (at least some of them) are turing complete. the other question asking whether there's any other application of abstract interpretation other than type checking/type inference. to which i say no or at least none that i've ever seen ie no one ever does anything with it something about eg control flow. it's always only about types.

edit: i was about to say "for example no one ever uses abstract interpretation for escape analysis" and then i wanted to double check and lo and behold https://dl.acm.org/doi/abs/10.1145/945885.945886. so i guess i'm wrong (and the other original question can be answered in the negative) but i would still argue that it's true in practice. the reason is obvious - the space/lattice of possible types of a program is huge, the space of states is .......... busy beaver scale.

There’s forward and backward propagation. Backward propagation is generally less about the type of a variable, but it’s still a form of abstract interpretation. For example, is-live (or dead) or used analysis, or reachable analyses are not really type information, as they tend to depend on the control flow after a point.

There’s some other analysis that’s not type-like as such, for example available expressions.

You might as well express escape analysis as inferring an "escaping reference" or "non-escaping reference" type for each allocation site. Then you're back to "abstract interpretation is type inference". But wanting to make this statement true also makes it meaningless. Tautological-by-definition statements aren't interesting.
Abstract interpretation can do constant folding and branch elision. It changes the program instance. Type checking is about invariants over a specific program, not changing the program itself.
Most languages have type polymorphism that do change the program itself. But I agree, neither of those examples are instances of it.
> "the" instance

i suppose you can probably do anything with dependent types, but i'm not sure this is a useful perspective. i commend you to read my comments on the red website https://lobste.rs/s/xkcrvn/

(i do think it is a valid question whether abstract interpretation is a good idea)