by m12k
1556 days ago
I recently discovered that search-and-replacing text in a PDF without changing the layout is much harder than I thought it would be (a customer forgot to change their billing address, and now that the invoice is finalized, Stripe won't let me edit anything, so down the PDF-editing rabbit hole I went). I would love it if I could just use an API for this.
|
Sometimes text is positioned absolutely, relative to the page border; sometimes it is positioned relative to other elements, so moving one word shifts everything that follows. There can be multiple matrices involved in positioning a text element. Sometimes text elements are all positioned independently, sometimes by using newlines with a custom size. A text element can span multiple lines or words, but sometimes each letter is its own text element, and then it is hard to even determine which letters go together or whether there's meant to be a space. Additionally, fonts can be subsetted, in which case you can't use letters that aren't already in the document without getting hold of the original font. And then there can be OCR'ed PDFs, where an image of the scanned text is overlaid on top of the real text. Oh, and there can be clipping paths: regions (often rectangles) that hide any text falling outside them.
And each PDF producer creates a different PDF structure.
For reading, PDFs are awesome. For editing, PDFs are a nightmare.
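
To make two of those points concrete (stacked matrices and one-letter-per-element text), here's a rough Python sketch. It's not how any particular PDF library does it: `concat`, `Glyph`, `join_glyphs`, and the `gap_ratio` threshold are all names and values I made up for illustration, and it assumes a single line of left-to-right text with glyph widths already known, which in a real PDF you'd have to dig out of the font metrics.

```python
from typing import List, NamedTuple, Tuple

# A PDF-style affine matrix [a b c d e f]; a point maps as
# x' = a*x + c*y + e,  y' = b*x + d*y + f.
Matrix = Tuple[float, float, float, float, float, float]


def concat(first: Matrix, second: Matrix) -> Matrix:
    """Combine two matrices so that `first` is applied before `second`."""
    a1, b1, c1, d1, e1, f1 = first
    a2, b2, c2, d2, e2, f2 = second
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


class Glyph(NamedTuple):
    """Hypothetical record for one independently positioned letter."""
    char: str
    x: float      # left edge, already in page coordinates
    width: float  # advance width, assumed known


def join_glyphs(glyphs: List[Glyph], font_size: float,
                gap_ratio: float = 0.25) -> str:
    """Guess the text of one line from loose glyphs.

    A space is inserted when the gap to the previous glyph exceeds
    gap_ratio * font_size. The threshold is a guess, not a standard.
    """
    parts: List[str] = []
    prev_end = None
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev_end is not None and g.x - prev_end > gap_ratio * font_size:
            parts.append(" ")
        parts.append(g.char)
        prev_end = g.x + g.width
    return "".join(parts)


# Text matrix (a plain translation) followed by a page-level matrix
# that scales by 2 and translates.
text_matrix: Matrix = (1, 0, 0, 1, 10, 20)
page_matrix: Matrix = (2, 0, 0, 2, 100, 50)
combined = concat(text_matrix, page_matrix)
text_origin = (combined[4], combined[5])  # where the text origin lands

# Glyphs in arbitrary order, as a producer might emit them.
loose = [Glyph("y", 14, 6), Glyph("H", 0, 6), Glyph("o", 20, 6), Glyph("i", 6, 3)]
line = join_glyphs(loose, font_size=10)
```

Even this toy version has to guess at the space threshold, and it ignores rotation, vertical position, kerning, and right-to-left text, which is pretty much why a general "replace this word" tool is so hard to build.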