| You're probably right in regards to compression ratios, and I also think that brotli would be a much better candidate. Proxies probably won't support it as "Transfer-Encoding: br" though. > Not possible (unless you're talking double gzip). gzip's max compression ratio is 1032:1[1]. So 100kB can expand to at most ~103MB with single gzip. Not sure if I understand the rest of your argument though. If the critique is that it's not possible and that I'm lying(?) about the unpacked file size on the proxy side, then below you'll find the answer from my pentester's perspective. Note that the goal is not to have a valid gzip archive, the goal is to have an as big as possible gzip archive _while unpacking_ and before the integrity and alignment checks. In order to understand how gzip's longest match algorithm works, I recommend to read that specific method first (grep for "longest_match(IPos cur_match)" in case it changes in the future): https://git.savannah.gnu.org/cgit/gzip.git/tree/deflate.c#37... The beauty of C code that "cleverly" uses pointers and scan windows with offsets is that there's always a technique to exploit it. I'll leave that for the reader to understand how the scan condition for unaligned windows can be modified so that the "good_match" and "MAX_MATCH" conditions are avoided, which, in return, leads to bigger chains than the 258 / 4096 bytes the linked StackOverflow answer was talking about :) |
Are you saying you can modify your gzip compression code locally to generate a malformed gzip file? That wouldn't be exploiting deflate.c , that would be exploiting the receiver's decompression code, which might be inflate.c or some other implementation of gzip decompression, which might be in some other language. The language used by the compression code doesn't seem relevant to me, rather it's the language used by the decompression code that might have vulnerabilities that can be exploited. If you have a compressed gzip file that expands to more than 1032:1, the file itself is a proof of concept of the vulnerability; it doesn't matter whether the file was generated by C, Rust, Python, or in a hex editor by hand.
If you've found something in gzip code that causes it to use significantly more memory or disk space than it should (either during compression or decompression), I think that's a denial or service vulnerability and should be reported to gzip.
[1] https://git.savannah.gnu.org/cgit/gzip.git/tree/inflate.c