by wscott 1521 days ago
BitKeeper used a seekable compressed file format for revision control data. It allowed a data structure to be loaded dynamically on demand without needing to uncompress the whole file. A large empty memory buffer was allocated and then read permission was removed with mprotect(). Then a signal handler populated regions of that buffer with data from the compressed file on demand, using the ability to seek to certain boundaries.
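
For anyone who has not seen the trick, here is a rough sketch of the mprotect() plus signal handler idea. It is not BitKeeper's code: every name in it (lazy_open, fill_page, on_fault, and so on) is made up for illustration, and fill_page() just writes a letter per page where the real thing would seek into the compressed file and decompress a block. It also assumes a Linux-like system where calling mprotect() from a SIGSEGV handler works in practice, even though POSIX does not list it as async-signal-safe.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* All names here are hypothetical; this is not BitKeeper's code. */
static unsigned char *lazy_base;            /* start of the protected buffer */
static size_t lazy_len;                     /* buffer length in bytes */
static size_t lazy_page;                    /* system page size */
static volatile sig_atomic_t lazy_faults;   /* pages populated so far */

/* Stand-in for "seek into the compressed file and decompress one block". */
static void fill_page(unsigned char *dst, size_t page_index)
{
    memset(dst, (int)('A' + page_index % 26), lazy_page);
}

/* Fault handler: populate the page that was touched, then let the
 * faulting instruction run again. */
static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    unsigned char *addr = (unsigned char *)info->si_addr;
    (void)ctx;
    if (addr < lazy_base || addr >= lazy_base + lazy_len) {
        /* Not our buffer: restore the default action so the retried
         * access crashes normally. */
        signal(sig, SIG_DFL);
        return;
    }
    size_t idx = (size_t)(addr - lazy_base) / lazy_page;
    unsigned char *page = lazy_base + idx * lazy_page;
    mprotect(page, lazy_page, PROT_READ | PROT_WRITE);  /* open for filling */
    fill_page(page, idx);
    mprotect(page, lazy_page, PROT_READ);               /* readable from now on */
    lazy_faults++;
}

/* Allocate npages of empty memory, remove all access, install the handler. */
static unsigned char *lazy_open(size_t npages)
{
    struct sigaction sa;
    void *p;
    lazy_page = (size_t)sysconf(_SC_PAGESIZE);
    lazy_len = npages * lazy_page;
    p = mmap(NULL, lazy_len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return NULL;
    lazy_base = p;
    if (mprotect(lazy_base, lazy_len, PROT_NONE) != 0) return NULL;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);
    sigaction(SIGBUS, &sa, NULL);   /* some systems raise SIGBUS instead */
    return lazy_base;
}
```

Calling lazy_open(4) and then reading any byte in the third page triggers exactly one fault; later reads from that page go straight to memory. The caller never has to know which parts of the structure are resident.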

This change achieved a 10X speedup on normal operations compared to the old code that used SCCS files.

The compressed format stored data blocks in arbitrary order, and an index at the end of the file gave the data layout. This allows writes to the file without rewriting existing data. BitKeeper itself only needed to append to the end of the file, but the format could support inserting data in the middle while only appending to the physical file.
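
To make the "insert in the middle by appending" part concrete, here is a toy version of that kind of layout. The layout itself (blocks, then an index, then a fixed-size trailer pointing at the index) and all the names (entry_t, trailer_t, insert_block, read_block) are my assumptions for this sketch, not BitKeeper's actual on-disk format, and it skips compression entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout, not BitKeeper's actual format:
 *   [block][block]...[index: n entries][trailer: index offset, n]
 * The index lists blocks in logical order; physical order is arbitrary. */
typedef struct { uint64_t off, len; } entry_t;
typedef struct { uint64_t index_off, count; } trailer_t;

#define MAX_BLOCKS 64

/* Load the newest index by reading the trailer at the end of the file. */
static size_t load_index(FILE *f, entry_t *idx)
{
    trailer_t t;
    if (fseek(f, 0, SEEK_END) != 0 || ftell(f) < (long)sizeof t) return 0;
    fseek(f, -(long)sizeof t, SEEK_END);
    if (fread(&t, sizeof t, 1, f) != 1 || t.count > MAX_BLOCKS) return 0;
    fseek(f, (long)t.index_off, SEEK_SET);
    return fread(idx, sizeof *idx, (size_t)t.count, f);
}

/* Insert data at logical position pos by appending only:
 * the new block, then a fresh index, then a fresh trailer.
 * The old index and trailer simply become dead space. */
static int insert_block(FILE *f, size_t pos, const void *data, size_t len)
{
    entry_t idx[MAX_BLOCKS + 1];
    trailer_t t;
    size_t n = load_index(f, idx);
    if (pos > n || n >= MAX_BLOCKS) return -1;
    fseek(f, 0, SEEK_END);
    memmove(&idx[pos + 1], &idx[pos], (n - pos) * sizeof *idx);
    idx[pos].off = (uint64_t)ftell(f);
    idx[pos].len = len;
    if (fwrite(data, 1, len, f) != len) return -1;
    t.index_off = (uint64_t)ftell(f);
    t.count = n + 1;
    if (fwrite(idx, sizeof *idx, n + 1, f) != n + 1) return -1;
    if (fwrite(&t, sizeof t, 1, f) != 1) return -1;
    return fflush(f);
}

/* Read the block at logical position pos with one seek to its data. */
static size_t read_block(FILE *f, size_t pos, void *out, size_t cap)
{
    entry_t idx[MAX_BLOCKS];
    size_t n = load_index(f, idx);
    if (pos >= n || idx[pos].len > cap) return 0;
    fseek(f, (long)idx[pos].off, SEEK_SET);
    return fread(out, 1, (size_t)idx[pos].len, f);
}
```

Inserting "aaa" at position 0, "ccc" at position 1, and then "bb" at position 1 leaves the bytes of "bb" physically last in the file, but read_block(f, 1, ...) returns "bb" and read_block(f, 2, ...) returns "ccc". Nothing already written is ever modified.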

It also had a data redundancy CRC layer that could detect data corruption and recover data from some types of corruption.
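
One common way to get both detection and limited recovery is a CRC per block plus an XOR parity block per group of blocks. The sketch below shows that general technique only; I am assuming it for illustration, and the group layout, block sizes, and names (group_t, group_seal, group_check) are all hypothetical rather than a description of what the BitKeeper layer actually stores.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLK  8   /* bytes per block (tiny, for illustration) */
#define NBLK 4   /* data blocks per parity group */

/* Standard bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static uint32_t crc32_of(const uint8_t *p, size_t n)
{
    uint32_t c = 0xFFFFFFFFu;
    while (n--) {
        c ^= *p++;
        for (int k = 0; k < 8; k++)
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
    }
    return ~c;
}

/* Hypothetical group layout: assumed for this sketch only. */
typedef struct {
    uint8_t  data[NBLK][BLK];
    uint32_t crc[NBLK];      /* per-block checksums: detection */
    uint8_t  parity[BLK];    /* XOR of all data blocks: recovery */
} group_t;

/* Compute the checksums and the parity block for the current data. */
static void group_seal(group_t *g)
{
    memset(g->parity, 0, BLK);
    for (int i = 0; i < NBLK; i++) {
        g->crc[i] = crc32_of(g->data[i], BLK);
        for (int j = 0; j < BLK; j++)
            g->parity[j] ^= g->data[i][j];
    }
}

/* Returns 0 if clean, 1 if one block was repaired, -1 if unrecoverable. */
static int group_check(group_t *g)
{
    int bad = -1;
    for (int i = 0; i < NBLK; i++) {
        if (crc32_of(g->data[i], BLK) != g->crc[i]) {
            if (bad >= 0) return -1;   /* two bad blocks: parity cannot help */
            bad = i;
        }
    }
    if (bad < 0) return 0;
    /* Rebuild the bad block as parity XOR all the good blocks. */
    memcpy(g->data[bad], g->parity, BLK);
    for (int i = 0; i < NBLK; i++)
        if (i != bad)
            for (int j = 0; j < BLK; j++)
                g->data[bad][j] ^= g->data[i][j];
    return crc32_of(g->data[bad], BLK) == g->crc[bad] ? 1 : -1;
}
```

With this scheme a single damaged block per group is detected by its CRC and rebuilt from the parity block, while damage to two blocks in the same group is detected but not repairable, which matches the "some types of corruption" caveat.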

https://www.bitkeeper.org/ https://github.com/bitkeeper-scm/bitkeeper/blob/master/src/l... https://github.com/bitkeeper-scm/bitkeeper/blob/994fb651a404...