| My recent horror from some git work was discovering how git sorts its tree objects. The docs just say to sort by C locale (byte-order sorting). Easy. Except git was sometimes rejecting my packfiles as bogus per its fsck code, saying my trees were misordered. TURNS OUT THERE'S AN UNDOCUMENTED RULE: you need to append an implicit forward slash to directory tree entry names before you sort them. That forward slash is not encoded in the tree object, nor is the type of the entry. You just put the 20-byte SHA-1 hash, which points to either a blob or a tree (or a commit for submodules). So a directory containing a directory "testing" and a file "testing.md" sorts differently than a directory containing two files "testing" and "testing.md". You can see a repro at https://gist.github.com/bradfitz/4751c58b07b57ff303cbfec3e39... (So to verify whether a tree object is formatted correctly, you need to have the blobs of all the entries in the tree, at least one level) |
The way I found out was that GitHub kept rejecting my pushes. As I later discovered, my git history was invalid precisely because my tree entries were sorted without the implicit forward slash. I could have fixed it with the real git, but the point was to use my own tool exclusively for version control from inception, so I just deleted the .git folder. As a result, my git history appears to begin near the end of the whole cycle. But I did manage to learn a lot, both about git and about the language I implemented it in.
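
For anyone who wants to see the rule from the quoted comment in action, here is a minimal Python sketch. It is not git's actual code, and it is not my tool either; the names `tree_sort_key` and `sort_tree_entries` are made up for illustration, and entries are simplified to `(name, is_dir)` pairs with no hashes attached. It only assumes what the comment describes: plain byte-order comparison, with a `/` appended to directory names for sorting purposes.

```python
# Sketch of the tree-entry ordering described above (assumed behavior:
# byte-order sort, with an implicit "/" appended to directory names).
# All names here are hypothetical, not git's API.

def tree_sort_key(name: bytes, is_dir: bool) -> bytes:
    """Return the key an entry is compared by: directories get a trailing '/'."""
    return name + b"/" if is_dir else name


def sort_tree_entries(entries):
    """Sort (name, is_dir) pairs the way the rule above requires."""
    return sorted(entries, key=lambda e: tree_sort_key(e[0], e[1]))


# Case 1: directory "testing" next to file "testing.md".
# b"testing/" vs b"testing.md": '.' (0x2e) sorts before '/' (0x2f).
with_dir = sort_tree_entries([(b"testing", True), (b"testing.md", False)])

# Case 2: two plain files "testing" and "testing.md".
# b"testing" is a prefix of b"testing.md", so it sorts first.
two_files = sort_tree_entries([(b"testing.md", False), (b"testing", False)])

print([name for name, _ in with_dir])
print([name for name, _ in two_files])
```

The same two names come out in opposite orders depending on whether "testing" is a directory, which is exactly the trap: a naive sort on the stored names alone gets the first case wrong.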