|
|
|
|
|
by brian_cloutier
3926 days ago
|
|
This is technical recruiting done right, a lucid walkthrough of a hard problem complete with links to the implementation. It must have taken at least a few weeks of engineer-time to write, Github is awesome for making this public. I wish they had talked a little more about the tradeoff they made. They mentioned that splitting packfiles by fork was space-prohibitive, but ended up with a solution which must take more space than originally used. (If the new heuristic refuses to use some objects as delta bases, some options which would have provided the best compression are no longer available to git) The performance win is incredible, how much space did they give up in the process? |
|
I didn't have any numbers on hand, so I just repacked our git/git network.git with and without the delta-aware code. A stock repack is about 300MB, and the delta-aware one is 400MB.
That sounds like a lot, but the real alternative is not sharing objects at all, which is more like 350GB.