by kris_wayton
1236 days ago
I ran it, and it installs these Python packages: Successfully installed PyMuPDF-1.21.1 fire-0.5.0 fonttools-4.38.0 lxml-4.9.2 numpy-1.24.1 opencv-python-4.7.0.68 pdf2docx-0.5.6 python-docx-0.8.11 six-1.16.0 termcolor-2.2.0
So it's a wrapper not around pandoc but around pdf2docx,
https://github.com/dothinking/pdf2docx
which parses PDF via PyMuPDF,
https://github.com/pymupdf/PyMuPDF
which is itself a wrapper around MuPDF (which does the heavy lifting of parsing the PDF),
https://mupdf.com/
and writes DOCX via python-docx,
https://github.com/python-openxml/python-docx
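
For anyone who wants to skip the wrapper, the chain above can probably be driven directly. This is a rough sketch, assuming pdf2docx's Converter class behaves as its README describes; the helper name pdf_to_docx and the file names are made up for illustration, and I have not checked it against version 0.5.6 specifically.

```python
from pathlib import Path


def pdf_to_docx(pdf_path: str, docx_path: str) -> None:
    """Convert a PDF to DOCX with pdf2docx (hypothetical helper, not part of the tool)."""
    # Fail early with a clear error if the input PDF is missing.
    if not Path(pdf_path).is_file():
        raise FileNotFoundError(pdf_path)

    # Imported lazily so this module still loads where pdf2docx is not installed.
    from pdf2docx import Converter

    cv = Converter(pdf_path)   # opens the PDF (pdf2docx reads it through PyMuPDF)
    try:
        cv.convert(docx_path)  # parses the pages and writes the DOCX (via python-docx)
    finally:
        cv.close()             # always release the underlying PDF handle


if __name__ == "__main__":
    # Placeholder file names; substitute your own.
    pdf_to_docx("input.pdf", "output.docx")
```

The existence check and the try/finally are my additions; the three pdf2docx calls are the part that matters.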