by xms17189 1 day ago
Interesting approach. Does keeping the model in HTML also preserve enough structure for tracked changes/comments, or do you handle those as a separate layer when converting back to DOCX?

Thank you!

My thesis is that an intermediate layer would eventually end up being equivalent to the docx format, so I've decided not to have any intermediate representation.

We convert the docx to HTML and send it to the AI. When the AI rewrites the HTML and sends it back, we diff the rewritten HTML against the docx's document.xml and apply the modifications. This is a simplified explanation; there are a bunch of validations and processing steps going on as well.
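
To make the diff step concrete, here is a rough sketch of the general idea, not the actual implementation. It assumes the text has already been extracted from both sides and compares it word by word with Python's difflib; the function name diff_tokens is made up for illustration, and the real pipeline works against document.xml with extra validation.

```python
import difflib


def diff_tokens(original: str, rewritten: str):
    """Return the non-equal edit operations between two texts, word by word.

    Hypothetical helper: each result is (operation, old_words, new_words).
    """
    a, b = original.split(), rewritten.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Keep only the spans that changed: replace, insert or delete.
            ops.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return ops


if __name__ == "__main__":
    for op in diff_tokens("The quick brown fox", "The quick red fox"):
        print(op)
```

Each returned operation would then have to be mapped back onto the matching runs in document.xml, which is where most of the real complexity lives.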

Regarding tracked changes/comments, we simply invent new HTML tags for those things, e.g. <ins>, <del>, <commentRangeStart>, etc.
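
As an illustration of that tag mapping, here is a minimal sketch, assuming a single WordprocessingML paragraph as input. The w:ins, w:del, w:delText and w:commentRangeStart element names come from the docx format itself; the function names and the id attribute on the output tag are assumptions for the example, not the real converter.

```python
import xml.etree.ElementTree as ET
from html import escape

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def run_text(run):
    """Concatenate the text of a w:r run (w:t is live text, w:delText is deleted text)."""
    wanted = (f"{{{W}}}t", f"{{{W}}}delText")
    return "".join(node.text or "" for node in run if node.tag in wanted)


def paragraph_to_html(p):
    """Hypothetical converter: one w:p element to an HTML string."""
    parts = []
    for child in p:
        name = child.tag.split("}")[1] if "}" in child.tag else child.tag
        if name == "r":
            parts.append(escape(run_text(child)))
        elif name in ("ins", "del"):
            # Tracked insertions and deletions wrap ordinary runs.
            text = "".join(run_text(r) for r in child.findall("w:r", NS))
            parts.append(f"<{name}>{escape(text)}</{name}>")
        elif name == "commentRangeStart":
            # Assumption: carry the comment id over as a plain attribute.
            cid = child.get(f"{{{W}}}id") or ""
            parts.append(f'<commentRangeStart id="{escape(cid)}"></commentRangeStart>')
    return "<p>" + "".join(parts) + "</p>"


SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    '<w:commentRangeStart w:id="0"/>'
    '<w:r><w:t xml:space="preserve">Hello </w:t></w:r>'
    '<w:ins w:id="1" w:author="A"><w:r><w:t>new</w:t></w:r></w:ins>'
    '<w:del w:id="2" w:author="A"><w:r><w:delText>old</w:delText></w:r></w:del>'
    "</w:p>"
)

if __name__ == "__main__":
    print(paragraph_to_html(ET.fromstring(SAMPLE)))
```

The sketch drops author and date metadata for brevity; a round trip back to DOCX would need to keep those somewhere, for example as extra attributes on the tags.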