Ask HN: How to merge office documents?
1 points by gehaxelt 3134 days ago
Hello HN community,

I am struggling to solve the following problem in an acceptable way, so I wanted to ask whether HN has any ideas, tips, or tricks. Thanks in advance :)

I have a handful of office documents in different formats and with different layouts/contents:

- {A,B,C}.odt

- {D,E,F}.doc

- {G,H,I}.docx

I cannot find a good way to merge all documents of any one format: either there is no command-line option/tool [0,1,2], or the layout gets messed up [3], e.g. with ooo_cat.

The expected result is a single file that contains the contents of the three files of a given format, without breaking the layout or causing other unwanted side effects. Merging the base template files into one big template is unfortunately not an option.

Questions:

Q1) What is the best way to smoothly merge several documents [programmatically/cmd line]?

Q2) The documents are generated using the process below. What is HN's way to generate nice-looking office documents with dynamic data? (Except programming/building the whole document and elements from scratch?)

A document is based on a .fodt [4] template with Jinja2 templating-blocks. The generation process is:

1.) Read the .fodt template file

2.) Populate content with the Jinja2 engine and save the "rendered" .fodt file

3.) Use cmd-line libreoffice to convert the .fodt file to .odt/.doc/.docx
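
For reference, the three steps above could look roughly like the sketch below. It is only an illustration, not the exact script: the function names are made up for this example, it assumes the LibreOffice binary is on the PATH as "soffice", and it turns on Jinja2 autoescaping because the template is XML (drop that if the data already contains markup).

```python
import subprocess
from pathlib import Path


def render_fodt(template_path, context, rendered_path):
    """Steps 1 and 2: read the .fodt template, render it, save the result."""
    # Jinja2 is a third-party package; it is imported here so that the
    # command-building helper below can be used without it installed.
    from jinja2 import Template

    source = Path(template_path).read_text(encoding="utf-8")
    # Assumption: values are plain text, so escape &, < and > for the XML.
    rendered = Template(source, autoescape=True).render(**context)
    Path(rendered_path).write_text(rendered, encoding="utf-8")
    return Path(rendered_path)


def build_convert_command(fodt_path, target_format, out_dir, binary="soffice"):
    """Step 3: build the headless LibreOffice conversion command."""
    return [
        binary,
        "--headless",
        "--convert-to", target_format,  # e.g. "odt", "doc" or "docx"
        "--outdir", str(out_dir),
        str(fodt_path),
    ]


def convert(fodt_path, target_format, out_dir):
    """Run the conversion; LibreOffice keeps the input file's base name."""
    subprocess.run(build_convert_command(fodt_path, target_format, out_dir), check=True)
    return Path(out_dir) / f"{Path(fodt_path).stem}.{target_format}"
```

Usage would be something like render_fodt("A.fodt", {"name": "value"}, "A.rendered.fodt") followed by convert("A.rendered.fodt", "docx", "out"), where the file names and the context dictionary are placeholders.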

[0] https://ask.libreoffice.org/en/question/19222/how-to-merge-multiple-documents-into-single-merged-document/

[1] https://forum.openoffice.org/en/forum/viewtopic.php?f=7&t=57435

[2] https://support.microsoft.com/en-us/help/2665750/how-to-merge-multiple-word-documents-into-one---eeekb

[3] https://askubuntu.com/questions/482277/how-to-merge-odt-documents-from-the-command-line

[4] https://en.wikipedia.org/wiki/OpenDocument_technical_specification

EDIT: Formatting

2 comments

This is something I did a lot using VBA. I'm not sure how to do it with the OpenDocument format, but it's really easy using VBA. The MS Word object has a method to combine documents.

I did something similar to what I think you are trying to do. I basically merged data into several different documents, and then I combined those documents into a single document. I could dig up the code if you think that is something you'd be interested in, but I'm not entirely sure what your end goal is.

I don't think there is one.

As someone who works on docx documents all the time, I find that even merging them together is often problematic.

If you have the original data, you should probably try to regenerate them all, since the build chain is clean.

Of course, if you do manage to get this working, let me know. It would save me time as well.

Hey,

I cross-posted the question on SO [0] and got an interesting answer!

[0] https://stackoverflow.com/questions/47351447/how-to-merge-of...