| This article again? In my opinion, this article is biased. The subtext here is that the author is claiming that his "lzip" format is superior. But xz was not chosen "blindly" as the article claims. To me, most of the claims are arguable. To say 3 levels of headers is "unsafe complexity"... I don't agree. Indirection is fundamental to design. To say padding is "useless"... I don't understand why padding and byte-alignment are given so much vitriol. Look at how much padding the tar format has. And tar is a good example of how "useless padding" was used to extend the format to support larger files. So this supposed "flaw" has been in tar for decades, with no disastrous effects at all. The xz decision was not made "blindly". There was thought behind the decision. And it's pure FUD to say "Xz implementations may choose what subset of the format they support. They may even choose to not support integrity checking at all. Safe interoperability among xz implementations is not guaranteed". You could say this about any software - "oh no, someone might make a bad implementation!" Format fragmentation is essentially a social problem more than a technical problem. I'll leave it at this for now, but there's more I could write. |
3 individual headers for one file format is unnecessary complexity.
> To say padding is "useless"
Padding in general is not useless, but padding in a compression format is very counterproductive.
> And it's pure FUD to say "Xz implementations may choose what subset of the format they support. They may even choose to not support integrity checking at all. Safe interoperability among xz implementations is not guaranteed". You could say this about any software - "oh no, someone might make a bad implementation!" Format fragmentation is essentially a social problem more than a technical problem.
This isn't about "someone making a bad implementation!", it's about crucial features being optional. That is, completely compliant implementations may or may not be able to decompress a given XZ archive, and may or may not be able to validate the archive.
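To make the "optional" part concrete, here is a minimal sketch using Python's standard `lzma` module (a binding to liblzma). It only shows that the writer of an .xz file picks the integrity check, that "none" is a valid pick, and that a reader can ask which checks its own build supports. The helper name `inspect_xz` and the sample payload are my own, purely for illustration.

```python
import lzma

data = b"example payload " * 64

# The writer chooses the integrity check stored in the .xz stream.
# CHECK_NONE is a legal choice, so a valid file may carry no check at all.
no_check = lzma.compress(data, format=lzma.FORMAT_XZ, check=lzma.CHECK_NONE)
with_crc32 = lzma.compress(data, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)


def inspect_xz(blob: bytes):
    """Hypothetical helper: decode one .xz stream and report its check ID."""
    decoder = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    payload = decoder.decompress(blob)
    return decoder.check, payload


for name, blob in (("none", no_check), ("crc32", with_crc32)):
    check_id, payload = inspect_xz(blob)
    print(name, "check id:", check_id, "payload intact:", payload == data)

# Whether a given check can be verified depends on how liblzma was built.
for label, check in (("CRC32", lzma.CHECK_CRC32),
                     ("CRC64", lzma.CHECK_CRC64),
                     ("SHA256", lzma.CHECK_SHA256)):
    print(label, "supported here:", lzma.is_check_supported(check))
```

The Python docs state that `CHECK_NONE` and `CHECK_CRC32` are always available, while `CHECK_CRC64` and `CHECK_SHA256` may be missing if liblzma was compiled with a limited feature set. So the last loop can legitimately print different answers on different machines.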
XZ may not have been chosen blindly, but it certainly does not seem like a sensible format. There is no benefit to this complexity. We do not need or benefit from a format that is flexible, as we can just swap formats and tools if we want to swap algorithms, like we have done so many times before (a proper compression format is just a tiny algorithm-specific header + trailing checksum, so it is not worth generalizing away).
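As a rough illustration of "tiny header + trailing checksum", here is a toy container in Python. To be clear, this is not lzip's or xz's actual layout: the `TOY1` magic, the fixed filter settings, the trailer fields, and the `pack`/`unpack` names are all assumptions made up for this sketch. It wraps a raw LZMA2 stream (no container of its own) with a 4-byte magic in front and a mandatory CRC32 plus uncompressed size behind.

```python
import lzma
import struct
import zlib

MAGIC = b"TOY1"  # hypothetical magic bytes, not a real format
# Fixed filter chain for the sketch; a real format would record the
# dictionary size in its header instead of hard-coding a preset.
FILTERS = [{"id": lzma.FILTER_LZMA2, "preset": 6}]
TRAILER = struct.Struct("<IQ")  # CRC32 of the data, uncompressed size


def pack(data: bytes) -> bytes:
    """Magic + raw LZMA2 stream + mandatory trailer."""
    body = lzma.compress(data, format=lzma.FORMAT_RAW, filters=FILTERS)
    return MAGIC + body + TRAILER.pack(zlib.crc32(data), len(data))


def unpack(blob: bytes) -> bytes:
    """Decode a toy archive; the integrity check is never optional."""
    if len(blob) < len(MAGIC) + TRAILER.size or blob[:len(MAGIC)] != MAGIC:
        raise ValueError("not a toy archive")
    crc, size = TRAILER.unpack(blob[-TRAILER.size:])
    body = blob[len(MAGIC):-TRAILER.size]
    data = lzma.decompress(body, format=lzma.FORMAT_RAW, filters=FILTERS)
    if len(data) != size or zlib.crc32(data) != crc:
        raise ValueError("integrity check failed")
    return data


original = b"hello, world\n" * 100
archive = pack(original)
print(unpack(archive) == original)
```

Every conforming reader of such a format has exactly one decoder and one check to implement, which is the point being argued here. Supporting a different algorithm would mean defining a new magic and a new tool rather than adding options to this one.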
Any and all benefits of XZ lie in LZMA2. We could have lzip2 and avoid all of these problems.
(I have no opinion as to whether LZIP should supersede GZIP/BZIP2, but XZ certainly seems like a poor choice.)