by Banana699 1445 days ago

I think dynamic analysis is incredibly powerful and criminally underused in IDEs and other dev tools.

I thought of an idea about 6 months ago that has been fermenting in my mind ever since: what if (e.g.) a Python VM had a mode where it recorded the type info of every identifier as it executed the code and persisted that info in a standard format? Later, when you open a .py file in an IDE, the type info for the objects defined or named in the file is pulled from the persistent record, crunched by the IDE, and used to present a top-notch dev experience.

The traditional Achilles' heel of static analysis and type systems is Turing completeness. The traditional answers range from trying and giving up (Java's Object or Kotlin's Any?), to not bothering to try in the first place (Ruby, Python, etc.), to very cleverly restricting what you can say so that you never or rarely run into the embarrassing problems (Haskell, Rust, ...), to whatever the fuck C++'s type system is. The type-profiling approach suggests another answer entirely: what if we just execute the damn thing without regard for types, like we already do now for Python and the like, but record everything that happens, so that later static analysis can do a whole ton of things it can't do from program text alone? You can have Turing-complete types that way; you just can't have them immediately (as soon as you write the code) or completely (there are always execution paths that aren't visited, which can change the types of things you think you know, e.g. x = 1 ; if VERY_SPECIFIC_RARE_CONDITION : x = "Hello").
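A minimal sketch of the recording step, assuming CPython's `sys.settrace` hook is an acceptable stand-in for a dedicated VM mode. The names `observed`, `profile_types`, `to_json`, and `example` are hypothetical, and the JSON layout is just one possible "standard format", not an existing one.

```python
import json
import sys
from collections import defaultdict

# Hypothetical record: (function name, variable name) -> set of observed type names.
observed = defaultdict(set)

def _record_locals(frame, event, arg):
    # The interpreter calls this for 'call', 'line', 'return', ... events.
    # On 'line' and 'return' we snapshot the types of the frame's locals.
    if event in ("line", "return"):
        for name, value in frame.f_locals.items():
            observed[(frame.f_code.co_name, name)].add(type(value).__name__)
    return _record_locals  # keep tracing inside this frame

def profile_types(func, *args, **kwargs):
    """Run func with the tracer installed and return its result."""
    sys.settrace(_record_locals)
    try:
        return func(*args, **kwargs)
    finally:
        sys.settrace(None)

def to_json():
    """Serialize the record so an IDE could load it later (assumed format)."""
    return json.dumps(
        {f"{fn}.{var}": sorted(types) for (fn, var), types in observed.items()},
        sort_keys=True,
    )

def example(rare_condition):
    x = 1
    if rare_condition:
        x = "Hello"
    return x

profile_types(example, False)
common_path_only = set(observed[("example", "x")])  # only the common path so far
profile_types(example, True)
both_paths = set(observed[("example", "x")])        # the rare path widens the type
```

The record is only as complete as the paths that have actually run: `x` looks like a plain int until some execution hits the rare branch, after which the recorded type widens to include str.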

You can have incredibly specific and fine-grained types, like "Dict[String->int] WHERE 'foo' in Dict and Dict['bar'] == 42", which is a peculiar subset of all string-int dictionaries: those that satisfy the WHERE clause. All of this would be "profiled" automatically from the runtime; you're already executing the code for free anyway. Essentially, type checking and inference become a never-halting computation amortized over all executions of a program, producing incremental results along the way.
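One way such a WHERE clause might be profiled, as a sketch only: collect the dictionaries seen at a program point and keep whatever held for every one of them. The function `infer_refinement` and its output shape are hypothetical, and the result is a guess that holds for the samples seen so far, not a proof.

```python
def infer_refinement(samples):
    """Summarize observed dicts: key/value types, keys that were always
    present, and keys whose value was the same in every sample."""
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one observed dict")

    # Keys present in every observed dict.
    always_present = set(samples[0])
    for d in samples[1:]:
        always_present &= set(d)

    # Among those, keys whose value never changed.
    constant_values = {}
    for key in always_present:
        first = samples[0][key]
        if all(d[key] == first for d in samples):
            constant_values[key] = first

    return {
        "key_types": sorted({type(k).__name__ for d in samples for k in d}),
        "value_types": sorted({type(v).__name__ for d in samples for v in d.values()}),
        "always_present": sorted(always_present),
        "constant_values": constant_values,
    }

# Two hypothetical observations of the same variable across runs.
refinement = infer_refinement([
    {"foo": 1, "bar": 42},
    {"foo": 7, "bar": 42, "baz": 3},
])
```

Here the summary reads as "dict of str to int WHERE 'foo' in d and d['bar'] == 42". Each new execution can only weaken it, which matches the incremental, never-finished nature of the approach.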

I have an opportunity ahead of me to at least have a go at this idea, but I'm not completely free to pursue it (others can veto the whole thing), and I'm not sure I have all the angles or the prerequisite knowledge necessary to dive in and make something that matters. If any of the good folks at JetBrains, Visual Studio, or similar orgs are reading this: please steal this idea and make it far better than I can, or at least pass it to others if you don't have the time.

11 comments

RyanCavanaugh 1445 days ago

This is how JavaScript intellisense in Visual Studio used to work, except that the program was executed "behind the scenes" using a trimmed-down VM that could execute without side effects or infinite loops. It was eventually abandoned due to poor performance, predictability, and stability.

The problem is this dilemma: If you have to wait for a "real" execution of a program, then very reasonable expectations like "I can see a local variable I just declared" don't work. If you try to fake-execute a program, you have problems like trying to figure out what to do with side-effecting calls, loops, and other control flow problems.

Reconciling a previous type snapshot with an arbitrary set of program edits was tried by an early version of TypeScript and wholly abandoned because it's extremely difficult to get right, and any wrongness quickly propagates and corrupts the entire state. The Flow team is still trying this approach and is having a very hard time with it, from what I can tell.