| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by mxmlnkn 1021 days ago

I assume you wanted to link to TurboBench and not that particular issue, which for some reason also contains a link for some car listing?

Secondly, did you not see my answer to ebiggers under this comment you replied to? Yes, for Silesia, libdeflate is faster, I can confirm, but there are at least two cases for which igzip is faster and one for which igzip is twice as fast. But yes, it heavily depends on the input data.

Edit: I was then wondering why I could not find any igzip benchmarks on the repository's ReadMe and then found https://github.com/powturbo/TurboBench/issues/43 , so I guess this is the one you wanted to link to and the 3 got cut off.

2 comments

powturbo 1021 days ago

Sorry, a digit was missing, but now https://github.com/powturbo/TurboBench/issues/43

Well, a correct benchmarking is not done with special data, but with datasets that represent a large set of distributions. Such datasets are for ex. einwik8/9 for text, silesia for a mixed dataset. As a corner case example, RLE-compressible data is not representative for benchmarking compression libraries.

If you provide a link for a dataset 10-100MB, I can verify your claims, because I'm not aware of a dataset where igzip is 2 times faster than libdeflate. In TurboBench there is no I/O or other overhead involved, additionally it's single threaded. It's also possible that you're comparing two different CLI programs, one (igzip) I/O optimized and the other as a simple CLI.

link

powturbo 1020 days ago

EDIT: I've seen the file you're referencing is 4Gi-Base64. This file is not very compressible (75% with gzip). It's possible that igzip is simply storing the file or some parts of it without compression. This explain why it can be faster that libdeflate, because in this case igzip is using memcpy at decompression.

link

mxmlnkn 1020 days ago

That is not the case. I have compressed the file with gzip and then tested decompression with multiple gzip decompressors. I.e., they all are fed the same compressed binary. Furthermore, I have used rapidgzip --analyze, to print out all deflate blocks, their compression types, and other information and statistics. All of the blocks are Dynamic Huffman compressed blocks.

I have already incorporated the ISA-L Huffman decoder into rapidgzip but that did not give full ISA-L speed. Ergo, I think the last performance comes from the inflate loop (decode Huffman, decode distance code if necessary, resolve references, repeat). This part is written in Assembler and seems to do some kind of speculative prefetching, i.e., already get the next Huffman Code symbol assuming that the current symbol is a literal or something like that. It's quite interesting but I doubt, or rather, I already tried a bit and failed to reproduce this kind of prefetching inside the C++ code but failed to do so. The compiler is probably rearranging everything anyway.

rapidgzip --analyze 4GiB-base64.gz | tail -100

    == Benchmark Profile (Cumulative Times) ==

    readDynamicHuffmanCoding : 0.903727 s (3.38252 %)
    readData                 : 25.8139 s (96.6175 %)
    Dynamic Huffman Initialization in Detail:
        Read precode       : 0.00975829 s (1.07978 %)
        Create precode HC  : 0.0341786 s (3.78196 %)
        Apply precode HC   : 0.0667695 s (7.38823 %)
        Create distance HC : 0.114097 s (12.6252 %)
        Create literal HC  : 0.678924 s (75.1249 %)


    == Alphabet Statistics ==

    Precode  : 123274 duplicates out of 126748 (97.2591 %)
    Distance : 597 duplicates out of 126748 (0.471013 %)
    Literals : 13799 duplicates out of 126748 (10.887 %)

    == Precode Code Length Count Distribution ==

    16 |==================== (126748)

    == Distance Code Length Count Distribution ==

    30 |==================== (126748)

    == Literal Code Length Count Distribution ==

             259 |=============        (50635)
    2.601250e+02 |==================== (74281)
                 |                     (1809)
             262 |                     (23)


    == Encoded Block Size Distribution ==

    185942 bits |                     (1)
                |                     
                |                     
                |                     
                |                     
                |                     
                |                     (2)
    207391 bits |==================== (126745)


    == Decoded Block Size Distribution ==

    30579 Bytes |                     (1)
                |                     
                |                     
                |                     
                |                     
                |                     
                |                     (1)
    34112 Bytes |==================== (126746)


    == Compression Ratio Distribution ==

    1.314967e+00 Bytes |                     (9)
                       |                     (1327)
                       |======               (17906)
    1.315808e+00 Bytes |==================== (52698)
                       |================     (42890)
                       |====                 (10876)
                       |                     (1001)
    1.316891e+00 Bytes |                     (41)

    == Deflate Block Compression Types ==

    Dynamic Huffman : 126748

link

powturbo 1019 days ago

Thank for your analyse. It seems it's a corner case, benchmarking a base64 encoded random file is not a typical case. igzip has no more advantage above libdeflate. It has only fast compression at the levels 0,1,2 but with mediocre compression ratio.

link