Hacker News
by stereosteve 1831 days ago
The problem happens with any fixed window spacing, regardless of the block size.

If you create a block every X MB, inserting a single byte at the beginning of the file will change every subsequent block.
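
A rough sketch of that effect, assuming SHA-256 per block and a tiny 1 KiB block size so it runs instantly. The names `BLOCK_SIZE` and `block_hashes` are made up for illustration and are not taken from SnowFS:

```python
import hashlib
import random

BLOCK_SIZE = 1024  # illustrative only; far smaller than a real chunk size


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash consecutive fixed-size blocks of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


# Pseudo-random content standing in for a large binary file.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(16 * BLOCK_SIZE))

# Insert a single byte at the very beginning of the file.
shifted = b"\x00" + original

before = block_hashes(original)
after = block_hashes(shifted)

# Every block boundary now falls one byte later in the content,
# so no block at the same index keeps its old hash.
unchanged = sum(1 for a, b in zip(before, after) if a == b)
print(f"{unchanged} of {len(before)} blocks unchanged")
```

The block size does not matter here: whatever fixed spacing you pick, the insertion shifts all the content that follows it across the block boundaries.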


Technically speaking you're wrong, but I'm sure the author doesn't want to reimplement block storage devices, so the spirit of the message is probably correct.

Oh, I'm not talking about disks. This is based on how SnowFS (the library for this project) splits up big files into chunks:

https://github.com/Snowtrack/SnowFS/blob/main/src/common.ts#...

The intent is a simple form of delta encoding: the hope is that many chunks will be common between two versions.
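
To illustrate the case where that hope does pan out, here is a minimal sketch assuming a chunk store keyed by SHA-256 digest. `CHUNK_SIZE`, `chunk_digests` and `store` are hypothetical names for this example, not SnowFS code. When a byte is overwritten in place (the file length stays the same), only the chunk containing it is new:

```python
import hashlib
import random

CHUNK_SIZE = 1024  # illustrative only


def chunk_digests(data, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 digest of each fixed-size chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


rng = random.Random(1)
v1 = bytes(rng.getrandbits(8) for _ in range(8 * CHUNK_SIZE))

# Overwrite one byte in place inside the fourth chunk; the length is unchanged.
pos = 3 * CHUNK_SIZE + 10
v2 = v1[:pos] + bytes([v1[pos] ^ 0xFF]) + v1[pos + 1:]

# Chunks already stored for version 1, keyed by digest.
store = set(chunk_digests(v1))

# Only chunks whose digest is not in the store need to be saved for version 2.
new_chunks = [d for d in chunk_digests(v2) if d not in store]
print(f"{len(new_chunks)} new chunk(s) out of {len(chunk_digests(v2))}")
```

An insertion or deletion breaks this, because it moves every later chunk boundary relative to the content.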

I should clarify this. The 100 MB window in SnowFS is currently unrelated to compression; it is only used to check whether a block changed. Each block gets a hash. This is a fallback for file formats where the mtime timestamp cannot be trusted. For some files the change is in the first block (e.g. the first 100 MB), and comparing that is faster than comparing an entire 8 GB file. The window size is dynamic, though, and could be changed and used for compression in the future.
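
A minimal sketch of that kind of check, not the actual SnowFS implementation: it assumes SHA-256 and a list of per-block hashes saved from the previous version, and the function name `first_changed_block` is invented for this example. It stops reading at the first mismatch, which is why a change in an early block is cheap to detect:

```python
import hashlib
import io


def first_changed_block(stream, known_hashes, block_size):
    """Return the index of the first block whose hash differs from
    known_hashes, or None if every block matches."""
    index = 0
    while True:
        block = stream.read(block_size)
        if not block:
            break
        # A block past the end of the old list means the file grew.
        if index >= len(known_hashes):
            return index
        if hashlib.sha256(block).hexdigest() != known_hashes[index]:
            return index  # stop early; later blocks are never read
        index += 1
    # The stream ended before all known blocks were seen: the file shrank.
    return index if index < len(known_hashes) else None


BLOCK = 4  # tiny block size so the example is easy to follow
old = b"aaaabbbbccccdddd"
known = [
    hashlib.sha256(old[i:i + BLOCK]).hexdigest()
    for i in range(0, len(old), BLOCK)
]

modified = b"aaaaXbbbccccdddd"  # same length, one byte changed in block 1
print(first_changed_block(io.BytesIO(modified), known, BLOCK))
print(first_changed_block(io.BytesIO(old), known, BLOCK))
```

With a real file you would pass an object opened with `open(path, "rb")` in place of the `io.BytesIO` wrapper.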
Ahh, this is my bad. For some reason I assumed the blocks were part of the storage scheme, but I see they are only used to compute hashes, and that the whole file is added to the zip. Sorry for the misunderstanding!