by lucb1e
1166 days ago
Would pigz's parallel compression prevent that failure case, or at least limit the damage to a single block (128K by default)? https://github.com/madler/pigz/blob/master/pigz.1#L43-L45 That would be a nice extra benefit on top of the speedup from being multithreaded. (I assume zstd also does multithreading, but for those stuck with gzip, pigz is a drop-in replacement.) Edit: bzip2 apparently does the same: "bzip2 compresses files in blocks, usually 900 kbytes long. Each block is handled independently. If a media or transmission error causes a multi-block .bz2 file to become damaged, it may be possible to recover data from the undamaged blocks in the file." (from man bzip2)
Other gzip variant formats like bgzip also make the chunks that are compressed in parallel completely independent of each other. This costs roughly 3% in compression ratio, depending on the use case.
Note that another problem with bit flips and other errors in compression formats is that most decompression tools will simply quit on the first error, even if the rest of the data could still be recovered.
Yes, bz2 is also more robust against errors because of the independent blocks.
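To illustrate the independent-block idea, here is a minimal Python sketch, not how pigz, bgzip, or bzip2 are actually implemented. It compresses each block as its own gzip member, flips one bit in the middle block, and decompresses block by block instead of quitting on the first error. The helper names (compress_blocks, flip_bit, recover_blocks) and the 16K block size are made up for the example, and the block boundaries are simply kept in a list rather than found by scanning a damaged file.

```python
import gzip
import zlib

BLOCK_SIZE = 16 * 1024  # illustrative size only, chosen to keep the demo small


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block as a separate gzip member that shares no state."""
    return [gzip.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]


def flip_bit(blob, index, bit=0):
    """Return a copy of blob with one bit flipped, simulating a media error."""
    damaged = bytearray(blob)
    damaged[index] ^= 1 << bit
    return bytes(damaged)


def recover_blocks(members):
    """Decompress every member on its own; a damaged one becomes None."""
    recovered = []
    for member in members:
        try:
            recovered.append(gzip.decompress(member))
        except (OSError, EOFError, zlib.error):
            # Bad deflate data, a CRC mismatch, or a truncated stream:
            # give up on this block only and keep going with the rest.
            recovered.append(None)
    return recovered


# Sample data that splits into three blocks.
data = b"".join(b"record %06d\n" % i for i in range(3000))
members = compress_blocks(data)

# Damage the middle of the second block's compressed payload.
members[1] = flip_bit(members[1], len(members[1]) // 2)

recovered = recover_blocks(members)
for n, block in enumerate(recovered):
    status = "lost" if block is None else f"recovered {len(block)} bytes"
    print(f"block {n}: {status}")
```

Only the damaged block is lost, because no block depends on the output of an earlier one. The price is that every block starts with an empty dictionary, which is where the compression-ratio penalty mentioned above comes from.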