by dpapathanasiou
5926 days ago
Have you thought about the reverse, i.e., a tool that could faithfully convert PDFs to HTML? I would be willing to pay money for a reliable tool that didn't need much manual editing after processing. Unfortunately, the pdftohtml project (http://pdftohtml.sourceforge.net/) has been inactive, and the current version has trouble with even moderately complex layouts.
The fundamental problem is that PDF stores the document's presentation, while HTML defines the document's structure and leaves the presentation to the browser. Restoring a document's definition from its presentation is hard because a lot of information is missing.
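
To make that concrete, here is a rough sketch of the kind of guesswork a converter has to do. It assumes the PDF has already been reduced to positioned text fragments; the `Fragment` type, the `group_into_lines` function, and the `y_tolerance` threshold are all made up for illustration and are not taken from pdftohtml or any other tool.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    """A hypothetical piece of text as a PDF might place it on the page."""
    x: float   # horizontal position, in points
    y: float   # vertical position, in points (PDF y grows upward)
    text: str


def group_into_lines(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[str]:
    """Guess which fragments belong to the same line of text.

    The page only records where each fragment sits, so "same line" has to
    be inferred from nearly equal y values. The tolerance is an arbitrary
    assumption, which is exactly why complex layouts go wrong.
    """
    lines = []  # each entry: (y of the first fragment in the line, fragments)
    # Visit fragments top to bottom, then left to right.
    for frag in sorted(fragments, key=lambda f: (-f.y, f.x)):
        if lines and abs(lines[-1][0] - frag.y) <= y_tolerance:
            lines[-1][1].append(frag)
        else:
            lines.append((frag.y, [frag]))
    # Within each guessed line, restore reading order by x position.
    return [
        " ".join(f.text for f in sorted(frags, key=lambda f: f.x))
        for _, frags in lines
    ]


page = [
    Fragment(110, 700.5, "world"),
    Fragment(72, 700, "Hello"),
    Fragment(72, 686, "Second"),
    Fragment(120, 686, "line"),
]
print(group_into_lines(page))
```

Even this only recovers lines. Whether those lines form a paragraph, a heading, a table cell, or two separate columns is not recorded anywhere on the page, so every further step is another heuristic.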