by ccurrens 1417 days ago
I attended a seminar on the Office binary file formats at MS about 10 years ago. The formats were designed that way for performance, including the wonky layout, which made it quicker to save and read files on slow media like floppy disks.

I also remember reading about that somewhere, sometime... loading... ah, here it is: https://www.joelonsoftware.com/2008/02/19/why-are-the-micros...

> The file format is contorted, where necessary, to make common operations fast. For example, Excel 95 and 97 have something called “Simple Save” which they use sometimes as a faster variation on the OLE compound document format, which just wasn’t fast enough for mainstream use. Word had something called Fast Save. To save a long document quickly, 14 out of 15 times, only the changes are appended to the end of the file, instead of rewriting the whole document from scratch. On the hard drives of the day, this meant saving a long document took one second instead of thirty. (It also meant that deleted data in a document was still in the file. This turned out to be not what people wanted.)
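
To make the "only the changes are appended" idea concrete, here is a toy sketch in Python. It is not Word's actual on-disk format: the `#op:payload` record layout and the names `full_save` and `fast_save` are made up for illustration. It only shows why an append-style save is cheap and why the "deleted" text is still sitting in the file afterwards.

```python
import io

def full_save(buf, text):
    """Rewrite the whole document from scratch (the slow path)."""
    buf.seek(0)
    buf.truncate()
    buf.write(text.encode("utf-8"))

def fast_save(buf, edits):
    """Append change records only; bytes already in the file are untouched.

    The "#op:payload" record layout is hypothetical, not Word's real format.
    """
    buf.seek(0, io.SEEK_END)
    for op, payload in edits:
        buf.write(f"\n#{op}:{payload}".encode("utf-8"))

doc = io.BytesIO()  # stands in for the file on disk
full_save(doc, "Salary offer: 90k")
# The user deletes the old text and types a replacement, then saves quickly.
fast_save(doc, [("delete", "0-17"), ("insert", "Salary offer: TBD")])

raw = doc.getvalue()
print(b"90k" in raw)  # the deleted text is still physically in the file
```

A reader that replays the records sees only the new text, but anyone who opens the raw bytes sees the old text too.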

The underlying file format, COM Structured Storage, is basically a filesystem-in-a-file and works much like FAT. So bits of deleted data would be floating around even if the app itself used no performance hacks.
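
A rough sketch of what "works much like FAT" implies, in Python. This is a toy container and not the real Structured Storage layout: the 16-byte sector size, the `TinyContainer` class, and the `FREE`/`END` markers are all assumptions for illustration. The point is that deleting a stream only frees its entries in the allocation table, so the sector contents stay in the file until something overwrites them.

```python
SECTOR = 16          # toy sector size, chosen for illustration
FREE, END = -1, -2   # hypothetical allocation-table markers

class TinyContainer:
    """Toy filesystem-in-a-file with a FAT-like sector chain."""

    def __init__(self, sectors=8):
        self.data = bytearray(SECTOR * sectors)  # the "file" contents
        self.fat = [FREE] * sectors              # next-sector table
        self.dir = {}                            # name -> (first sector, length)

    def write(self, name, payload):
        needed = max(1, -(-len(payload) // SECTOR))  # ceiling division
        free = [i for i, v in enumerate(self.fat) if v == FREE][:needed]
        if len(free) < needed:
            raise IOError("container full")
        for n, sec in enumerate(free):
            chunk = payload[n * SECTOR:(n + 1) * SECTOR]
            self.data[sec * SECTOR:sec * SECTOR + len(chunk)] = chunk
            # Link this sector to the next one, or terminate the chain.
            self.fat[sec] = free[n + 1] if n + 1 < len(free) else END
        self.dir[name] = (free[0], len(payload))

    def read(self, name):
        sec, length = self.dir[name]
        out = bytearray()
        while sec != END:
            out += self.data[sec * SECTOR:(sec + 1) * SECTOR]
            sec = self.fat[sec]
        return bytes(out[:length])

    def delete(self, name):
        # Only the directory entry and the chain are released;
        # the sector bytes themselves are never cleared.
        sec, _ = self.dir.pop(name)
        while sec != END:
            nxt = self.fat[sec]
            self.fat[sec] = FREE
            sec = nxt

c = TinyContainer()
c.write("secret", b"password=hunter2 and more")
c.write("notes", b"hello")
c.delete("secret")
print(b"hunter2" in bytes(c.data))  # stale bytes survive the delete
```

The stale bytes only disappear if a later write happens to reuse those sectors, which is the same reason undelete tools work on FAT volumes.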