| Really awesome work :-) I implemented the Fast Match / Simple Edit Script algorithm almost 10 years ago for my Master's thesis [1], as part of my database project [2][3], in order to import revisions of files with a hopefully minimal number of edit operations between the stored revision and a new one (back then it was for XML databases). The diffing was only one aspect of a visual analytics approach to comparing the revisions (tree structures) [4]. Internally, the nodes are addressed through dense, ascending 64-bit integers stored in a special trie index. Furthermore, during the import, changes can optionally be tracked and a rolling hash can optionally be stored for each node. After the import you can easily query the changes or execute time travel queries. Technically, a tree of tries is mapped to an append-only data file using a persistent data structure (in the functional sense): COW with path copying, plus a novel sliding snapshot algorithm for the leaf data pages themselves. I've always had the vision of implementing different visualizations to compare the revisions in a web frontend, but I'm currently spending my time on improving the latency of both writes and reads. So if someone would like to help, that would be awesome :-) Kind regards, Johannes [1] https://github.com/JohannesLichtenberger/master-thesis/blob/... [2] https://github.com/sirixdb/sirix [3] https://github.com/sirixdb/sirix/tree/master/bundles/sirix-c... [4] https://youtube.com/watch?v=l9CXXBkl5vI |
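
For anyone unfamiliar with COW and path copying, here is a toy Python sketch of how a trie keyed by 64-bit node IDs could keep every revision readable. It is not the actual SirixDB code: the fan-out of 256, the dict-based nodes, and the names `put`, `get`, and `revisions` are all assumptions made for illustration, and it ignores on-disk pages, the rolling hashes, and the sliding snapshot algorithm entirely.

```python
from typing import Any, Optional

BITS = 8                 # bits consumed per trie level (assumed fan-out of 256)
LEVELS = 64 // BITS      # 8 levels cover a 64-bit key
MASK = (1 << BITS) - 1


def slot_for(key: int, level: int) -> int:
    """Pick the child slot for `key` at `level`, most significant bits first."""
    return (key >> (BITS * (LEVELS - 1 - level))) & MASK


def put(node: Optional[dict], key: int, value: Any, level: int = 0) -> dict:
    """Return a new root containing key -> value; `node` itself is never mutated."""
    # Copy only the node on the path to the key; all other children are shared.
    copy = dict(node) if node is not None else {}
    slot = slot_for(key, level)
    if level == LEVELS - 1:
        copy[slot] = value
    else:
        copy[slot] = put(copy.get(slot), key, value, level + 1)
    return copy


def get(node: Optional[dict], key: int) -> Any:
    """Look up `key` starting from the given revision root; None if absent."""
    for level in range(LEVELS):
        if node is None:
            return None
        node = node.get(slot_for(key, level))
    return node


# A revision is just a root reference; old roots stay valid, like an append-only log.
revisions = [None]                                       # revision 0: empty
revisions.append(put(revisions[-1], 1, "<a/>"))          # revision 1
revisions.append(put(revisions[-1], 2, "<b/>"))          # revision 2
revisions.append(put(revisions[-1], 1, "<a id='x'/>"))   # revision 3
revisions.append(put(revisions[-1], 1 << 56, "<c/>"))    # revision 4: different top-level slot
```

A "time travel query" in this sketch is simply `get(revisions[n], key)` against an older root. Each write copies at most `LEVELS` nodes; everything off the path is shared between revisions, which is why revision 4 can reuse the whole subtree that holds keys 1 and 2.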