by stereosteve
1831 days ago
From a quick read of the SnowFS source code, it looks like it splits large files into 100 MB blocks and builds up a zip of blocks over time. A version of a file is an ordered list of hashes for the blocks in that version. I like the simplicity of this! But is it at all problematic if something changes early in the file and all the subsequent block boundaries shift, causing many new blocks to be created? rsync uses a sliding window to handle this situation. The implementation would be more complicated, but have you considered using librsync internally?
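To make the boundary-shift concern concrete, here is a small sketch. It is not SnowFS code and not rsync's actual algorithm: it compares fixed-size blocks with content-defined chunking, a related technique that picks cut points from a rolling sum over a sliding window. The function names, the 256-byte block size, the 16-byte window and the divisor are all made up for the demo.

```python
import hashlib
import random


def sha(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def fixed_blocks(data: bytes, size: int = 256) -> list:
    """Hash consecutive fixed-size blocks (the simple scheme described above)."""
    return [sha(data[i:i + size]) for i in range(0, len(data), size)]


def content_defined_blocks(data: bytes, window: int = 16, divisor: int = 64) -> list:
    """Cut a block wherever the sum of the last `window` bytes hits a target value.

    Cut points depend only on nearby content, so they move with the data
    when bytes are inserted earlier in the file.
    """
    hashes = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte that left the window
        # Only test for a cut once the window is full.
        if i >= window - 1 and rolling % divisor == divisor - 1:
            hashes.append(sha(data[start:i + 1]))
            start = i + 1
    if start < len(data):
        hashes.append(sha(data[start:]))  # trailing block
    return hashes


rng = random.Random(0)
before = bytes(rng.getrandbits(8) for _ in range(4096))
after = b"X" + before  # one byte inserted at the very start

fixed_shared = set(fixed_blocks(before)) & set(fixed_blocks(after))
cdc_before = set(content_defined_blocks(before))
cdc_shared = cdc_before & set(content_defined_blocks(after))

print("fixed-size blocks reused:", len(fixed_shared), "of", len(set(fixed_blocks(before))))
print("content-defined blocks reused:", len(cdc_shared), "of", len(cdc_before))
```

With fixed-size blocks, a single inserted byte changes every block after it. With content-defined cuts, only the block containing the edit should change. The trade-off is variable block sizes and a pass over every byte to find the cut points.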
Let me explain where the 100 MB window comes from, as it's not only related to the upcoming compression implementation. Some graphics applications touch the timestamps of their files for no reason, making it harder to detect whether a file changed. But some file formats always change their 'header' or 'footer'. That means comparing the hash of the first or last 100 MB of a file that is 8 GB in size gives a great performance boost when detecting whether a file got modified.
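As a rough illustration of that head/tail check (a sketch only, not the SnowFS implementation), the following hashes just the first and last window of a file. The name `edge_fingerprint`, the use of SHA-256, and mixing the file size into the hash are assumptions made for this example.

```python
import hashlib
import os

WINDOW = 100 * 1024 * 1024  # 100 MB, the window size mentioned above


def _feed(f, h, n, piece=1 << 20):
    """Feed up to n bytes from f into hash h, 1 MiB at a time."""
    while n > 0:
        data = f.read(min(piece, n))
        if not data:
            break
        h.update(data)
        n -= len(data)


def edge_fingerprint(path, window=WINDOW):
    """Hash only the first and last `window` bytes of a file (hypothetical helper).

    Assumption: the file size is mixed in as well, so that growing or
    shrinking the file always changes the fingerprint.
    """
    size = os.path.getsize(path)
    h = hashlib.sha256()
    h.update(size.to_bytes(8, "big"))
    with open(path, "rb") as f:
        _feed(f, h, window)  # head
        if size > window:
            # Start the tail after the head so small files are not hashed twice.
            f.seek(max(window, size - window))
            _feed(f, h, window)  # tail
    return h.hexdigest()
```

For an 8 GB file this reads at most 200 MB instead of the whole file. It is a heuristic: an edit that touches only the middle of a large file and leaves the size unchanged would go unnoticed, so it only helps for formats that really do rewrite their header or footer on every save.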