by redman25
2063 days ago
PDFs act more like images than text. A while ago I made a tool for diffing PDFs at the visual level (http://parepdf.com) because I needed a way to see the explicit differences between PDFs. Diffing PDFs at the textual level is a much harder problem, though, since lines of text need to be reordered and concatenated with each other. Unfortunately, there is nothing built into the format that tells you which line belongs with which other line, so it comes down to guesswork.
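To make the "guesswork" concrete, here is a minimal sketch of the kind of heuristic a textual diff would need first. It is not how parepdf or any particular library works. The `Fragment` type, the `group_into_lines` helper, the `y_tolerance` parameter, and the sample coordinates are all hypothetical, invented for illustration. The assumption is that some extractor has already handed you text fragments with page positions, in arbitrary order.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    # Hypothetical positioned piece of text, as an extractor might report it.
    x: float  # left edge on the page
    y: float  # baseline position (assumed here: larger means lower on the page)
    text: str


def group_into_lines(fragments, y_tolerance=2.0):
    """Guess which fragments share a line: baselines within y_tolerance."""
    lines = []  # each entry is a list of fragments guessed to be one line
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        # Compare against the first (topmost) fragment of the current line.
        if lines and abs(lines[-1][0].y - frag.y) <= y_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    # Within each guessed line, read left to right and join with spaces.
    return [
        " ".join(f.text for f in sorted(line, key=lambda f: f.x))
        for line in lines
    ]


# Made-up fragments, deliberately out of reading order.
fragments = [
    Fragment(120, 100.4, "income"),
    Fragment(50, 100.0, "Total"),
    Fragment(50, 114.0, "Tax"),
    Fragment(90, 113.6, "due"),
]

print(group_into_lines(fragments))                  # ['Total income', 'Tax due']
print(group_into_lines(fragments, y_tolerance=20))  # ['Total Tax due income']
```

The second call shows why this is fragile: the result depends entirely on a tolerance you have to guess, and a value that works for one layout merges unrelated lines in another.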
I attempted something similar (https://nicediff.com) and found the textual approach to be basically useless:
Tax form example: https://www.nicediff.com/view/7a5f41ba3c76ae9bb45f42a4faa8b6...