by mxmlnkn
1027 days ago
My extended benchmarks [0] use the rapidgzip, igzip, gzip, and pigz command line utilities and simply redirect the output to /dev/null to minimize interference from I/O writes. That's where I got the comparison of igzip to "zlib", as it is used in pigz. I have not compared libdeflate against these benchmarks very often, but before posting my comment, I quickly ran "time libdeflate-gzip -f -k -d 4GiB-base64.gz" inside /dev/shm, which took 20 s vs. 9 s for igzip. libdeflate-gzip is something I built and installed a while ago from libdeflate/programs/gzip.c. libdeflate-gzip -V prints: "gzip compression program v1.18. Copyright 2016 Eric Biggers".

I am aware that lots of care also has to be taken with I/O, which might make the command line utility slower than the library interface, but doing the tests in /dev/shm hopefully alleviated this. I am also aware that base64-encoded random data is a weird test case, but it has its pros: it is a kind of minimal benchmark for raw Huffman decoding speed without (many) LZ references that need to be resolved.

I redid the benchmark as outlined above with the three test files that I am also using for my extended benchmarks [0]:

4GiB-base64.gz -> libdeflate: 20.5 s, igzip: 9.4 s, rapidgzip: 1.5 s
20xsilesia.tar.gz -> libdeflate: 5.4 s, igzip: 6.6 s, rapidgzip: 1.8 s
10xSRR22403185_2.fastq.gz -> libdeflate: 5.8 s, igzip: 5.5 s, rapidgzip: 1.9 s
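
For anyone who wants to repeat this kind of measurement, here is a minimal sketch of the timing procedure: run a decompressor, discard its standard output (the equivalent of redirecting to /dev/null), and keep the best wall-clock time. The helper name time_command and the repeat count are my own assumptions for illustration; this is not the script behind the numbers above.

```python
import subprocess
import sys
import time


def time_command(argv, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of argv.

    stdout is discarded (like redirecting to /dev/null), so writing the
    decompressed data does not dominate the measurement.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Hypothetical usage: pass the full decompressor command line as arguments.
    if len(sys.argv) > 1:
        print(f"{time_command(sys.argv[1:]):.2f} s")
```

For example, time_command(["gzip", "-d", "-c", "4GiB-base64.gz"]) would time plain gzip decompressing to stdout. Taking the minimum over a few runs is only one possible choice; it reduces noise from other processes but hides variance.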

File sizes in bytes (compressed -> uncompressed):

4GiB-base64.gz : 3263906203 -> 4294967296
20xsilesia.tar.gz : 1364776140 -> 4239155200
10xSRR22403185_2.fastq.gz : 970458140 -> 3618153020
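
To make the timings easier to compare across files of different sizes, they can be converted to decompression bandwidth (uncompressed bytes per second). The sketch below does that arithmetic with the numbers from this comment. It assumes that 4294967296 bytes (exactly 4 GiB) is the uncompressed size of the base64 file, and the function name is made up for illustration.

```python
# Uncompressed sizes in bytes, taken from the file size list in this comment.
# Assumption: 4294967296 (exactly 4 GiB) is the uncompressed base64 size.
UNCOMPRESSED_BYTES = {
    "4GiB-base64.gz": 4294967296,
    "20xsilesia.tar.gz": 4239155200,
    "10xSRR22403185_2.fastq.gz": 3618153020,
}

# Wall-clock decompression times in seconds, taken from the results above.
SECONDS = {
    "4GiB-base64.gz": {"libdeflate": 20.5, "igzip": 9.4, "rapidgzip": 1.5},
    "20xsilesia.tar.gz": {"libdeflate": 5.4, "igzip": 6.6, "rapidgzip": 1.8},
    "10xSRR22403185_2.fastq.gz": {"libdeflate": 5.8, "igzip": 5.5, "rapidgzip": 1.9},
}


def throughput_mb_per_s(size_bytes, seconds):
    """Decompression bandwidth in MB/s (1 MB = 10**6 bytes)."""
    return size_bytes / seconds / 1e6


if __name__ == "__main__":
    for name, size in UNCOMPRESSED_BYTES.items():
        for tool, seconds in SECONDS[name].items():
            print(f"{name:28s} {tool:10s} {throughput_mb_per_s(size, seconds):8.1f} MB/s")
```

With these inputs, libdeflate comes out at roughly 210 MB/s on the base64 file and roughly 785 MB/s on the Silesia file, which shows how strongly the result depends on the test data.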

In conclusion, it seems that the result depends heavily on the test case, and the one I used to quickly check my statement is one of the outliers.

[0] https://github.com/mxmlnkn/rapidgzip#scaling-benchmarks-on-2...
[0] https://github.com/mxmlnkn/rapidgzip/blob/master/src/benchma...