Hacker News new | ask | show | jobs
by mlavrent 818 days ago
Sorry, I should've been clearer. I'm interested if there's any tool that does this kind of thing statically, without running the code. I guess a simple approach is to compile both programs and see if the generated code is the same, but I'd guess reasoning at the generated-code level will probably produce a lot more false positives (i.e. tool will report a change when there isn't one) than if you reason about the original program.
2 comments

This gets really hard, really fast. That is, yes, reasonably obviously doing this completely 100% accurately requires a solution to the halting problem, but even getting to "useful" is really really hard. Even the Haskell world doesn't try to solve the "equivalence of functions" problem, and it's even more complicated in imperative languages.

You probably have a mental image of catching something really simple, and, yeah, "1 + 1" -> "2" is reasonably easy, but in reality there aren't a lot of those super easy changes. Most of the time there is something confounding the situation.

Truly neutral refactorings are pretty uncommon in their own right. You can see that when someone is discussing semantic versioning and pointing out that if you define a "major version" as "there exists at least one possible use of the code whose behavior will be changed as a result of this library change", almost any API change is automatically a major version change, which isn't really what anyone wants. E.g., in Python, the mere fact that introspecting on an object's methods will show one more method than it used to isn't really what we want a major version change for. In general, proving refactorings are actually 100% safe is equally difficult; even simple arithmetic changes can result in things overflowing at different times or in different ways, it's virtually impossible to rewrite an expression involving floats without the change being witnessable somehow, extracting a function could make it so that code that previously didn't overflow the stack now does, memory allocation changes can be the difference between OOMing and not and may interact with GC in unpredictable ways if you get really precise, etc.

Here's a fun related article on Indistinguishability Obfuscation: https://cacm.acm.org/research/indistinguishability-obfuscati...

TL;DR verifying that 2 functions have the same output is really freaking hard.

The Unison language (https://www.unison-lang.org/) knows how to compute whether the semantic meaning of the code has changed (though I don't think it's possible to get the actual diff to visualize it).

You can edit a function you've committed into the Unison code repo, and if you didn't change the semantics of the function, it's actually stored under the exact same hash... All places using the function refer to it by its hash, so nothing needs to be recompiled either, and no tests need to be rerun.

Things like renaming variables, reordering code whose order doesn't matter (common in functional programming) and things like that do NOT change the hash.

I believe this is only possible because Unison is a Pure Functional Language. If it's not, it becomes a NP problem to decide if two programs are exactly equivalent, probably.

I wonder if Unison could provide the actual semantic diff you're thinking of, it's probably not much more complex than actually knowing the meaning of the code did change. Maybe create a Feature Request :) https://github.com/unisonweb/unison