by dmd 4 days ago
So I just tried this with a bunch of medium-complex documents and it's wildly wrong. I suspect the authors have never seen an actually complicated Word document?
3 comments

Do you have any actual examples of documents where it did not work?
My examples are all internal/confidential, but if someone from the project wants an example I could probably do some search/replace redaction. It would be a lot of work, though, because there are photographs and such too, plus indexes, tables, documents inserted by reference, cross-references, conditional fields, bibliography fields, formula fields, etc.
vibe coded, it seems from the commit history (and readme lol)
Office XML is surprisingly complex under the hood. The format packages multiple XML streams, relationships, and content types into a ZIP — making debugging without specialized tooling painful.
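
For anyone who hasn't looked inside one of these files, here is a rough sketch of the packaging described above, using only the Python standard library. The toy package built by `build_toy_package` is a hypothetical, heavily simplified stand-in for a real .docx (it is not something Word would necessarily open), and both function names are made up for illustration. The namespaces and part names follow the Open Packaging Conventions.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the Open Packaging Conventions.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_toy_package() -> bytes:
    """Build a tiny in-memory ZIP shaped like a .docx (illustrative only)."""
    content_types = (
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="rels" '
        'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Override PartName="/word/document.xml" '
        'ContentType="application/vnd.openxmlformats-officedocument.'
        'wordprocessingml.document.main+xml"/>'
        "</Types>"
    )
    root_rels = (
        f'<Relationships xmlns="{REL_NS}">'
        '<Relationship Id="rId1" '
        'Type="http://schemas.openxmlformats.org/officeDocument/2006/'
        'relationships/officeDocument" Target="word/document.xml"/>'
        "</Relationships>"
    )
    document = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        "<w:t>Hello</w:t></w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", content_types)
        zf.writestr("_rels/.rels", root_rels)
        zf.writestr("word/document.xml", document)
    return buf.getvalue()


def summarize_package(data: bytes) -> dict:
    """List the ZIP parts, the content-type overrides, and root relationship targets."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        parts = zf.namelist()
        types = ET.fromstring(zf.read("[Content_Types].xml"))
        rels = ET.fromstring(zf.read("_rels/.rels"))
    overrides = {
        el.get("PartName"): el.get("ContentType")
        for el in types.findall(f"{{{CT_NS}}}Override")
    }
    targets = [el.get("Target") for el in rels.findall(f"{{{REL_NS}}}Relationship")]
    return {"parts": parts, "overrides": overrides, "root_targets": targets}


if __name__ == "__main__":
    info = summarize_package(build_toy_package())
    for name in info["parts"]:
        print(name)
```

Even this toy needs three separate parts before any text shows up, which gives a feel for why a real document with headers, styles, media, and embedded objects is hard to debug by hand.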

Rendering to HTML Canvas is a pragmatic choice. We work with legal documents daily and the fidelity gap between native Office rendering and HTML-based viewers is one of those "last 10%" problems that takes 90% of the effort. Things like tracked changes formatting, table layout inheritance, and nested content controls rarely render correctly in lightweight viewers.

For document-heavy workflows (legal, compliance, procurement), having a viewer that preserves structural fidelity — especially revision marks and annotations — is table stakes. Most web-based solutions we tested lost formatting on documents with complex nested structures.
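
As a rough illustration of what "revision marks and annotations" look like in the markup, the sketch below counts tracked insertions (`w:ins`), tracked deletions (`w:del`), and comment anchors (`w:commentReference`) in a WordprocessingML fragment. The `SAMPLE` string and the `count_revision_marks` helper are hypothetical and only for illustration. This counts elements and says nothing about rendering them, which is the hard part.

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical minimal fragment: one tracked insertion, one tracked
# deletion, and one comment anchor inside a single paragraph.
SAMPLE = f"""<w:document xmlns:w="{W_NS}"><w:body><w:p>
<w:ins w:id="1" w:author="A"><w:r><w:t>new</w:t></w:r></w:ins>
<w:del w:id="2" w:author="A"><w:r><w:delText>old</w:delText></w:r></w:del>
<w:r><w:commentReference w:id="0"/></w:r>
</w:p></w:body></w:document>"""


def count_revision_marks(document_xml: str) -> dict:
    """Count w:ins, w:del, and w:commentReference elements anywhere in the tree.

    Note: w:ins and w:del can also appear inside property elements (for
    example to mark an inserted paragraph mark), so on real documents
    these totals count markup elements, not user-visible changes.
    """
    root = ET.fromstring(document_xml)

    def count(tag: str) -> int:
        return sum(1 for _ in root.iter(f"{{{W_NS}}}{tag}"))

    return {
        "insertions": count("ins"),
        "deletions": count("del"),
        "comment_refs": count("commentReference"),
    }


if __name__ == "__main__":
    print(count_revision_marks(SAMPLE))
```

A check like this can at least flag which documents contain revision markup, so a viewer can warn the user instead of silently dropping it.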

Interesting approach. Does the Canvas rendering handle tracked changes and inline comments? That is where most viewers break down.

I don't know why this was flagged, but you are right.

Google Docs [1] and OnlyOffice [2] also use canvas rendering for office documents, and have found it reliable and consistent across different browsers.

[1]: https://workspaceupdates.googleblog.com/2021/05/Google-Docs-...

[2]: https://helpcenter.onlyoffice.com/faq/technology.aspx

The account is flagged because it's a bot account.