

by mananaysiempre 1829 days ago

The use of Content-Encoding for compression is actually something of a historical wart: what was intended to be used for that purpose is Transfer-Encoding, but modern browsers don’t even send the TE header necessary to permit the HTTP server to use it (except for Transfer-Encoding: chunked, which every HTTP/1.1 client must accept), even though some servers are perfectly capable of it and all but the most broken will at least ignore the header. Things like 7bit, 8bit, binary, or quoted-printable are not supposed to be in the HTTP Content-Encoding header, either, but their presence is at least somewhat understandable as they are valid in the MIME Content-Transfer-Encoding header, and HTTP originally shared much of its infrastructure with MIME (think Content-Disposition: attachment).
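For illustration, here is roughly what a request that opts in to compressed transfer codings could look like. This is only a sketch: `build_request` is a made-up helper, not something any browser or library ships, and the coding list is an arbitrary choice. The one part taken from the HTTP/1.1 spec is that TE is hop-by-hop, so a client sending it also has to list `TE` in the Connection header.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 request that advertises transfer codings.

    Hypothetical helper for illustration; the coding list is an assumption.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "TE: gzip, deflate",   # transfer codings the client is willing to accept
        "Connection: TE",      # TE only applies to this hop, so name it here
        "",                    # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_request("example.com")
```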

I guess what I’m getting at here is that the blame for the C-E weirdness lies in large part on the browsers, which could’ve made a clean break and improved the semantics at the same time by using T-E, but instead chose to initiate a chicken-and-egg dilemma out of a desire to support broken HTTP servers from the last century.

(The intended semantics are that C-E, an “end-to-end” header, says “this resource genuinely exists in this encoded form”, while T-E, a “hop-by-hop” header, says “the origin or proxy server you’re using incidentally chose to encode this resource in this form”; this is why sometimes the wrong combination of hacks in the HTTP server and the Web browser will leave you with a tar file when you expected a tar.gz file.)
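A small sketch of what those intended semantics would mean for a client saving a download to disk: transfer codings are undone on receipt, content codings are left alone because they are part of the resource. The function name `bytes_to_save` is invented here, it assumes chunked framing has already been removed by the HTTP layer, it only handles the `gzip` coding, and it expects canonical header capitalization.

```python
import gzip
from typing import Mapping


def bytes_to_save(headers: Mapping[str, str], body: bytes) -> bytes:
    """Return the bytes a 'save to disk' should write under the intended semantics.

    Hypothetical helper: only the gzip coding is handled, and chunked
    framing is assumed to have been stripped already.
    """
    codings = [c.strip().lower()
               for c in headers.get("Transfer-Encoding", "").split(",")
               if c.strip()]
    # Transfer codings are a property of the hop: undo them, last applied first.
    for coding in reversed(codings):
        if coding == "gzip":
            body = gzip.decompress(body)
        elif coding != "chunked":
            raise ValueError(f"unsupported transfer coding: {coding}")
    # Content-Encoding is deliberately untouched: a tar.gz stays a tar.gz.
    return body


tar_bytes = b"pretend this is a tar archive" * 10
tar_gz_bytes = gzip.compress(tar_bytes)

via_te = bytes_to_save({"Transfer-Encoding": "gzip"}, tar_gz_bytes)
via_ce = bytes_to_save({"Content-Encoding": "gzip"}, tar_gz_bytes)
```

With T-E the saved file is the plain tar; with C-E it is still the gzipped bytes, which is the distinction the tar versus tar.gz confusion comes from.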

The use of “gzip” as the compression is also a wart, because it’s essentially “deflate” (which is what you want: DEFLATE compression with a checksum) with a useless decompressed-filename field (wat?) and a decompressed-mtime field (double wat?) stacked on top in its header.

2 comments

nayuki 1828 days ago

Even though HTTP DEFLATE saves ~20 bytes compared to GZIP, it itself is a wart because of some vendor misunderstandings. HTTP DEFLATE is actually DEFLATE data wrapped in a zlib container, not raw DEFLATE. See https://en.wikipedia.org/wiki/HTTP_compression#Problems_prev... ; https://stackoverflow.com/questions/3932117/handling-http-co...
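To make the three framings concrete, here is a quick sketch using Python's `zlib` module, where the `wbits` argument selects the container: negative for raw DEFLATE, 15 for the zlib container, 31 for gzip. The helper name `deflate_with` and the sample data are arbitrary. With the same compression settings, the DEFLATE stream in the middle should be identical and only the wrapper differs.

```python
import zlib


def deflate_with(wbits: int, data: bytes) -> bytes:
    """Compress data with the container selected by wbits (illustrative helper)."""
    comp = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return comp.compress(data) + comp.flush()


data = b"hello, world\n" * 100

raw = deflate_with(-15, data)     # raw DEFLATE, no header or trailer
zlib_wrapped = deflate_with(15, data)   # what HTTP "deflate" is supposed to be
gzip_wrapped = deflate_with(31, data)   # what HTTP "gzip" is

zlib_overhead = len(zlib_wrapped) - len(raw)   # 2-byte header + Adler-32
gzip_overhead = len(gzip_wrapped) - len(raw)   # 10-byte header + CRC-32 + size
```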


benibela 1828 days ago

I just implemented decompression in my HTTP client this week.

I could not test that part because both servers I tried send raw deflate, without the zlib container.
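One common way for a client to cope with both behaviours is to try the zlib container first and fall back to raw DEFLATE if that fails. A minimal sketch, assuming the whole body is already in memory; `decode_http_deflate` is a made-up name, not part of any library.

```python
import zlib


def decode_http_deflate(body: bytes) -> bytes:
    """Decode a 'deflate'-coded body, tolerating servers that omit the zlib wrapper."""
    try:
        # What the spec asks for: DEFLATE inside a zlib container.
        return zlib.decompress(body)
    except zlib.error:
        # What some servers actually send: raw DEFLATE with no header or checksum.
        return zlib.decompress(body, -zlib.MAX_WBITS)


payload = b"some response body " * 50

spec_compliant = zlib.compress(payload)
raw_compressor = zlib.compressobj(6, zlib.DEFLATED, -zlib.MAX_WBITS)
raw_only = raw_compressor.compress(payload) + raw_compressor.flush()
```

Both `decode_http_deflate(spec_compliant)` and `decode_http_deflate(raw_only)` should give back `payload`, which also makes it possible to test the zlib path locally without a cooperating server.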


jfrunyon 1828 days ago

The original filename is optional in gzip. It is not included in the response sent by, for example, Apache.

(There is a mandatory MTIME which is included, and an OS byte, but those only waste 5 bytes total. Far less than gzip will typically save.)
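The header layout can be checked with a few lines of Python. This sketch writes gzip data with and without a filename through the standard `gzip.GzipFile` and reads back the fixed 10-byte header; `make_gzip` and `parse_header` are illustrative names, and the parser assumes no FEXTRA field is present (it would sit before the filename).

```python
import gzip
import io
import struct

FNAME = 0x08  # flag bit that says an original filename follows the header


def make_gzip(data: bytes, filename: str = "", mtime: int = 0) -> bytes:
    """Produce a gzip member, optionally recording a filename (illustrative)."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, mode="wb", fileobj=buf, mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


def parse_header(blob: bytes) -> dict:
    """Read the fixed header fields; assumes FEXTRA is not set."""
    magic, method, flags, mtime, xfl, os_id = struct.unpack("<2sBBIBB", blob[:10])
    if magic != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    name = None
    if flags & FNAME:
        end = blob.index(b"\x00", 10)          # filename is NUL-terminated
        name = blob[10:end].decode("latin-1")
    return {"method": method, "flags": flags, "mtime": mtime,
            "os": os_id, "name": name}


anonymous = parse_header(make_gzip(b"payload"))
named = parse_header(make_gzip(b"payload", filename="a.txt", mtime=123))
```

In the first case the filename is simply absent and only the 4-byte MTIME and 1-byte OS fields remain; in the second the name is stored right after the fixed header.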
