The XML parsing/validation bugs are, I suppose, not shocking, but deeply disappointing. The one thing XML & its tooling were supposed to get right was document well-formedness. Sure, it might be a mess of a standard in other ways, but at least we could agree on what a parser should and shouldn’t accept! (Not the case for the HTML tag soup of then or now.) That, 25 years on, a popular XML processor can’t even meet that low bar for tag names is maddening.
1) Don't rely on two parsers having identical behaviour for security. Yes, parsers for the same format should behave the same, but bugs happen, so don't design a system where small differences between them result in such a catastrophic bug. If you absolutely have to do this, at least use the same parser on both ends.
2) Don't allow layering violations. All content of an XML document is required to be valid in the configured character encoding. That means layer 1 of your decoder should be converting a byte stream into a character stream, and layers 2+ should not even have the opportunity to mess up decoding a character. Efficiency is not a justification, because you can use compile-time techniques to generate the exact same code as if you had combined all layers into one. This has the added benefit of removing edge cases: if there is only one place where bytes are decoded into characters, then you can't get a bug where that decoding is broken only in tag names, and so your test coverage is automatically better.
3) Don't transparently download and install stuff without user interaction, regardless of where it comes from!
4) Revoke certificates for old compromised versions of an installer so that downgrade attacks are not possible.
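To make point 2 a bit more concrete, here's a rough Python sketch of the layering. It is not how any real XML library is structured: `decode_layer`, `tag_names` and `parse` are names I made up, and the regex "tokenizer" is a toy that only pulls out tag names. The point is just that bytes become characters in exactly one place, and everything after that only ever sees `str`.

```python
import re
from typing import List


def decode_layer(raw: bytes, encoding: str = "utf-8") -> str:
    """Layer 1: the ONLY place where bytes are turned into characters.

    errors="strict" makes any invalid byte sequence raise
    UnicodeDecodeError, no matter where in the document it sits.
    """
    return raw.decode(encoding, errors="strict")


# Toy pattern (an assumption for illustration, not a real XML grammar):
# an optional "/", then a name made of anything but whitespace, "<", ">", "/".
_TAG = re.compile(r"<(/?)([^\s<>/]+)[^<>]*>")


def tag_names(text: str) -> List[str]:
    """Layer 2: works on already-decoded characters only.

    It takes str, not bytes, so it has no opportunity to decode
    a character differently inside a tag name than elsewhere.
    """
    return [m.group(2) for m in _TAG.finditer(text)]


def parse(raw: bytes) -> List[str]:
    # Layers are composed, never interleaved.
    return tag_names(decode_layer(raw))
```

With this shape, something like `parse(b"<a\xff>x</a\xff>")` fails in `decode_layer` with a `UnicodeDecodeError` before the tag-name code ever runs, which is the "can't be broken only in tag names" property from point 2.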