| Sorry, I was in a hurry and I think I mixed up articles, since another one recently appeared on HN that pointed to a page with a much better explanation.
The main idea behind Pijul (as I understand it) is that it makes merging divergent branches correct and predictable by making patch application an associative operation.[1] What this means is that, given patches A, B and C, it should be irrelevant whether you are:
1. starting with A, applying B on top, followed by applying C
2. starting with A, applying the result of C applied on top of B all at once
In other words, (AB)C = A(BC)
This isn't always what happens in git because it doesn't work with patches on an abstract level. Instead, it always works with states of the branch heads (and sometimes the state of their branching point, i.e. BASE, in case of a 3-way merge).
As to why this matters, [2] and [3] have practical test cases where the difference between these two approaches is observable. [3] is an example of plausible C code where the non-patch approach may produce an incorrect result. It also demonstrates how the patch-algebraic approach is able to take individual changes into account and apply edits from the second branch in the correct place in the first branch, even though the affected code has moved in the first branch in the meantime.
The focus on patches has other important implications, such as the fact that branches then simply become sets of patches, and cherry-picking retains the identity of patches instead of creating new, unique commits.
The other important aspect of Pijul is its use of efficient data structures that are naturally suited to the problem, in order to avoid suboptimal algorithmic complexity. This is explained in more detail here[4].
[1]: https://pijul.org/manual/why_pijul.html
[2]: https://tahoe-lafs.org/%7Ezooko/badmerge/simple.html
[3]: https://tahoe-lafs.org/%7Ezooko/badmerge/concrete-good-seman...
[4]: https://pijul.org/model/ |
Now some more in-depth bla bla, if you're interested:
> (AB)C = A(BC)
great feature request, I agree.
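To spell out what the two groupings mean, here is a minimal Python sketch. Everything in it (the line ids, `apply_patch`, `compose`, the sample patches) is made up for illustration, and associativity holds here by construction. It says nothing about how Pijul or git handle patches internally.

```python
# Toy model, not Pijul's actual format: a state is a list of (line_id, text)
# pairs and a patch is a list of (anchor_id, line_id, text) insertions.

def apply_patch(state, patch):
    """Insert each new line directly after the line whose id is anchor_id."""
    result = list(state)
    for anchor_id, line_id, text in patch:
        pos = next(i for i, (lid, _) in enumerate(result) if lid == anchor_id)
        result.insert(pos + 1, (line_id, text))
    return result

def compose(first, second):
    """Combine two patches into one that applies `first`, then `second`."""
    return first + second

EMPTY = [("root", "")]  # sentinel line so the first insertion has an anchor

A = [("root", "a1", "int main() {"), ("a1", "a2", "}")]
B = [("a1", "b1", '    puts("hello");')]
C = [("b1", "c1", "    return 0;")]

# (AB)C: apply A, then B on top, then C
one_by_one = apply_patch(apply_patch(apply_patch(EMPTY, A), B), C)
# A(BC): apply A, then the combined effect of B and C all at once
grouped = apply_patch(apply_patch(EMPTY, A), compose(B, C))

# Drop the sentinel to get the visible file contents.
text = [line for line_id, line in grouped if line_id != "root"]
```

Both `one_by_one` and `grouped` end up as the same list of lines, which is all the equation asks for.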
> This isn't always what happens in git
correct.
> because it doesn't work with patches on an abstract level. Instead, it always works with states
Incorrect though. States and patches are just two representations with different trade-offs; you can represent either one completely in terms of the other. It's like building a list out of a tree structure by allowing only one branch, or building a tree on top of a list structure when your traversal algorithm knows which item index to pick for a given subtree's children. All trade-offs.
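As a rough illustration of going back and forth between the two representations, here is a small Python sketch using the standard library's `difflib`. The helper names `diff_states` and `apply_patch` and the sample states are assumptions for the example. It only shows a snapshot history being turned into patches and back, not how git or Pijul store anything.

```python
import difflib

def diff_states(old, new):
    """Derive a patch from two states: a list of (start, end, replacement)."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]

def apply_patch(state, patch):
    """Derive the next state from a state plus a patch."""
    result = list(state)
    # Work from the end of the file so earlier indices stay valid.
    for start, end, replacement in reversed(patch):
        result[start:end] = replacement
    return result

# A history stored as states (snapshots) ...
states = [[], ["x = 1"], ["x = 1", "y = 2"], ["x = 10", "y = 2"]]

# ... converted into a history stored as patches ...
patches = [diff_states(a, b) for a, b in zip(states, states[1:])]

# ... and converted back into states by replaying the patches.
rebuilt = [[]]
for patch in patches:
    rebuilt.append(apply_patch(rebuilt[-1], patch))
```

After the loop, `rebuilt` holds the same snapshots as `states`.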
That doesn't mean "Pijul does better diffs" is wrong, though. It can still be the case. But it also doesn't mean that git would need a huge refactoring to implement that better diff algorithm. In the end, implementing it in git might be trivial for a git core developer if you can explain to them how it works.
If you think about it, a diff between two states can be represented in an unlimited number of ways. And if you consider the minimal steps that generate a diff by adding and removing lines, the whole thing is an abstract tree. Basically, to achieve associative patches one needs to make sure to always traverse this tree in the same order. Git traverses greedily though, using the very first diff that is good enough as the final result. The idea behind this was probably smart too: do it quickly for now, and optimize later if needed.
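A tiny Python example of that ambiguity, with made-up data and a hypothetical `apply_insert` helper: three different single-insert edit scripts all turn the same old state into the same new state, so the two states alone cannot tell you which edit actually happened.

```python
def apply_insert(lines, index, inserted):
    """Return a copy of `lines` with `inserted` spliced in before `index`."""
    return lines[:index] + inserted + lines[index:]

old = ["a", "b"]
new = ["a", "b", "a", "b"]

# Three different (index, inserted_lines) diffs for the same old -> new change.
candidates = [
    (0, ["a", "b"]),  # a copy was added at the top
    (1, ["b", "a"]),  # two lines were added in the middle
    (2, ["a", "b"]),  # a copy was added at the bottom
]

results = [apply_insert(old, index, inserted) for index, inserted in candidates]
all_valid = all(result == new for result in results)
```

Which candidate a tool picks only starts to matter once another branch has edited or moved one of those lines.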