by tikhonj 4770 days ago
I was working on a very similar project myself recently [1], heavily inspired by ydiff. Unfortunately, I didn't implement the tree diff algorithm entirely correctly, and it's been on hiatus for over a year :(.

The one interesting addition my project had was merging. The neat trick was that we reused the same tree diff algorithm to find conflicts :P. With a bit of work, we would have some very neat features, including the ability to resolve certain conflicts which physically overlap.

[1]: http://jelv.is/cow
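
One way to picture the merge trick, as a rough sketch and not the actual implementation from [1]: diff each branch against the common ancestor with the same tree diff, then call two edits a conflict only when their tree paths overlap. The tuple-based tree encoding and the names `diff`, `patch`, and `merge` are assumptions made for illustration, and the diff here is deliberately naive (it only descends when labels and child counts match).

```python
def diff(old, new, path=()):
    """Return {path: replacement} for subtrees where `new` differs from `old`.

    Trees are tuples of the form (label, child, ...); leaves are plain values.
    A path is a tuple of child indices leading from the root to a subtree.
    """
    if old == new:
        return {}
    same_shape = (
        isinstance(old, tuple) and isinstance(new, tuple)
        and old[:1] == new[:1] and len(old) == len(new)
    )
    if same_shape:
        edits = {}
        for i in range(1, len(old)):
            edits.update(diff(old[i], new[i], path + (i,)))
        return edits
    # Different label or arity: treat the whole subtree as replaced.
    return {path: new}


def patch(tree, path, replacement):
    """Return a copy of `tree` with the subtree at `path` replaced."""
    if not path:
        return replacement
    i = path[0]
    return tree[:i] + (patch(tree[i], path[1:], replacement),) + tree[i + 1:]


def overlaps(p, q):
    """True if one path is a prefix of the other (same or nested subtree)."""
    n = min(len(p), len(q))
    return p[:n] == q[:n]


def merge(base, ours, theirs):
    """Three-way merge: returns (merged_tree, list_of_conflicting_path_pairs)."""
    a, b = diff(base, ours), diff(base, theirs)
    conflicts = [
        (p, q) for p in a for q in b
        if overlaps(p, q) and not (p == q and a[p] == b[q])
    ]
    bad_a = {p for p, _ in conflicts}
    bad_b = {q for _, q in conflicts}
    merged = base
    # Remaining edits touch disjoint (or identical) subtrees, so order is irrelevant.
    for p, sub in a.items():
        if p not in bad_a:
            merged = patch(merged, p, sub)
    for q, sub in b.items():
        if q not in bad_b:
            merged = patch(merged, q, sub)
    return merged, conflicts
```

With `base = ("add", ("num", 1), ("num", 2))`, one branch changing the `1` to `10` and the other changing the `2` to `20` merge cleanly in this sketch, even though a line-based merge would see both edits on the same line of `1 + 2`. Two branches changing the same leaf to different values are reported as a conflict and left at the base value.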

1 comment

I thought of a variant of this approach for distributed code storage and reuse. Store each function as an AST with normalized variable names. A cryptographic hash of that AST then uniquely identifies that implementation of the algorithm, independent of naming, so caching becomes quite easy and very scalable. Callers refer to a specific version by its hash as well, so your entire program can be identified by a single hash. Nice to distribute and cache at web scale. What do you think?
It's what we need. This became clear to me shortly after learning git: it's the right data structure for the problem.

The hurdle is that hashes aren't user-friendly in a text-based code editor. We need an editor that lets us view and work at the right level of abstraction.
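
A minimal sketch of the normalized-AST hashing idea from the comment above, using Python's standard `ast` and `hashlib` modules. The function name `function_hash`, the `deps` parameter (callee name to callee hash), and the `v0, v1, ...` renaming scheme are all assumptions made for illustration; a real system would also need to handle scoping, docstrings, and much more.

```python
import ast
import hashlib


def function_hash(source, deps=None):
    """Hash a single function definition, ignoring the names it chooses.

    `deps` optionally maps the names of called functions to their hashes,
    so that a caller's hash depends on the exact versions it calls.
    """
    func = ast.parse(source).body[0]

    # The function's own name and its callees get fixed replacements.
    mapping = {func.name: "self"}
    mapping.update(deps or {})

    # First pass: give every parameter and assigned variable a canonical
    # name, numbered in traversal order (which depends only on structure).
    counter = 0
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            name = node.arg
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            name = node.id
        else:
            continue
        if name not in mapping:
            mapping[name] = f"v{counter}"
            counter += 1

    # Second pass: apply the renaming; unknown globals and builtins are kept.
    func.name = "self"
    for node in ast.walk(func):
        if isinstance(node, ast.arg):
            node.arg = mapping[node.arg]
        elif isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]

    # ast.dump omits line and column numbers by default, so layout is ignored.
    return hashlib.sha256(ast.dump(func).encode("utf-8")).hexdigest()
```

Under these assumptions, `def add(x, y): ...` and a renamed copy `def plus(a, b): ...` with the same body hash identically, while passing `deps={"add": <hash of add>}` when hashing a caller makes the caller's hash change whenever `add` does. That is the same Merkle-style chaining git uses for trees and commits, which is how one hash could stand for a whole program.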