by burnt-resistor 308 days ago
Related to multiple .zip formats: I've found that macOS Archive Utility sometimes refuses to extract early PKZIP .zips created on MS-DOS, yet Info-ZIP handles them just fine.

And the macOS Archive Utility will complain that a proper .tar.bz2 created using bzip2 is "corrupt".

In general, be liberal in input and be conservative in output. Sometimes, this means using fewer features or certain older formats so that all/most things work without issues.


> In general, be liberal in input and be conservative in output.

That is a dangerous maxim in a world with malicious players. In fact this PyPI problem is precisely because zip files are being too readily accepted, even if they have ambiguous meaning. Their fix is (very sensibly) to be less liberal with their input.

No, it's a well-regarded, fundamental engineering principle of standard and interoperable systems.

This hasn't been considered well-regarded for a very long time. I'd say the opposite is dogmatic: we're in a post-Postel world[1], in part because of observed security failures over the last 30 years.

[1]: https://alexgaynor.net/2025/mar/25/postels-law-and-the-three...

Maybe in the early days of the internet, or in closed systems, but in the open internet it's naive and actually leads to more brittle systems.

Sorry to repeat myself, but case in point is the article we're discussing! If something is accepted that is not in the specification then obviously its behaviour is unspecified. That means different implementations can easily have different behaviours, which can lead to security issues exactly like this one.
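
To make that concrete, here is a minimal sketch of what "less liberal" input handling could look like for one kind of ambiguity: duplicate entry names, where one extractor might keep the first copy and another the last. This is not what PyPI actually does; `check_zip_strict` and `AmbiguousArchiveError` are names made up for illustration, and only Python's standard `zipfile` module is assumed.

```python
import io
import zipfile
from collections import Counter


class AmbiguousArchiveError(ValueError):
    """Hypothetical error: the archive could be read differently by different tools."""


def check_zip_strict(data: bytes) -> list[str]:
    """Return the entry names, or raise if any name appears more than once.

    Only duplicate names in the central directory are checked here; a real
    validator would need to cover other ambiguities as well.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # infolist() keeps every central-directory entry, including repeats.
        names = [info.filename for info in zf.infolist()]

    duplicates = sorted(name for name, count in Counter(names).items() if count > 1)
    if duplicates:
        # Reject instead of silently picking one of the copies.
        raise AmbiguousArchiveError(f"duplicate entry names: {duplicates}")
    return names
```

The point of the design is that the checker refuses to guess: instead of choosing which copy "wins", it rejects the upload, so every downstream tool sees either the same unambiguous archive or nothing at all.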

Even when there's no malice involved, it can often lead to de facto extensions to the specification. If several implementations accept something outside the standard (with roughly the same behaviour), and one accidentally produces it, then soon it becomes relied upon and all implementations need to handle it. Then the standard is no longer authoritative, or has to be retrospectively updated (see for example the huge part of the HTML spec that deals with otherwise invalid HTML).