| The opinion in the article misses something fundamental. The complexity is not artificial; it is completely organic and natural. It is incidental complexity born of decades of history, backwards compatibility, lip service to openness, and regulatory-compliance checkbox ticking. It wasn't purposefully added; it just happened. Every large document-based application's file format is like this, no exceptions. As a random example, Adobe Photoshop PSD files are famously horrific to parse, let alone interpret in any useful way. There are many, many other examples; I don't aim to single out any particular vendor. All of this boils down to the simple fact that these file formats have no independent existence apart from their editor programs. They're simply serialised application state, little better than memory dumps. They encode every single feature the application has, directly. They must! Otherwise the feature states couldn't be saved. It's tautological. If it's somewhere in Word, Excel, PowerPoint, or any other Office app, it has to go into the files too. There are layers and layers of this history and complex internal state, and all of it has to be represented in the file: compatibility flags, OLE embedding, macros, external data sources, incremental saves, support for the quirks of legacy printers that no longer exist, CMYK, document signing, document review notes, and on and on. No extra complexity had to be added to the OOXML file formats; what is there is just a reflection of the complexity of the Microsoft Office applications. Simplicity was never engineered into these file formats. Engineering it in would have been a tremendous extra effort for zero gain to Microsoft. Don't blame Microsoft for this either, because other vendors did the exact same thing, for the exact same pragmatic reasons. |
What do they expect people to do, remove features in order to support other formats? Users won't like that.