by
mythrwy
1308 days ago
There is a command-line utility (pdf2txt.py, part of pdfminer.six) that can also dump the PDF as an XML tree, which you can then query with XPath. I found it works well.
https://pdfminersix.readthedocs.io/en/latest/reference/comma...
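For anyone who wants to try the XPath part, here is a rough sketch using only Python's standard library. The XML string is a hand-written, heavily simplified stand-in for what `pdf2txt.py -t xml` emits (real output has one `<text>` element per character plus font and bbox attributes), and the helper names `textbox_strings` and `textbox_by_id` are made up for illustration, so treat the element layout as an assumption to check against your own output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample; real pdf2txt.py XML output carries
# more attributes (font, size, per-character bbox) than shown here.
SAMPLE_XML = """<pages>
<page id="1" bbox="0,0,612,792">
<textbox id="0" bbox="72,700,300,720">
<textline bbox="72,700,300,720"><text>H</text><text>i</text></textline>
</textbox>
<textbox id="1" bbox="72,650,300,670">
<textline bbox="72,650,300,670"><text>O</text><text>K</text></textline>
</textbox>
</page>
</pages>"""


def textbox_strings(xml_source):
    """Return the joined characters of every textbox, in document order."""
    root = ET.fromstring(xml_source)
    result = []
    # ElementTree supports a limited XPath subset, enough for this query.
    for box in root.findall(".//page/textbox"):
        chars = [t.text or "" for t in box.findall("./textline/text")]
        result.append("".join(chars))
    return result


def textbox_by_id(xml_source, box_id):
    """Return the text of the first textbox with the given id, or None."""
    root = ET.fromstring(xml_source)
    box = root.find(f".//textbox[@id='{box_id}']")
    if box is None:
        return None
    return "".join(t.text or "" for t in box.iter("text"))


if __name__ == "__main__":
    print(textbox_strings(SAMPLE_XML))
    print(textbox_by_id(SAMPLE_XML, "1"))
```

If you need full XPath 1.0 (axes, functions like `contains()`), a library such as lxml is probably a better fit than `xml.etree.ElementTree`, which only handles a subset.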
1 comment
mdaniel
1308 days ago
That makes sense, as "pdfquery" uses pdfminer.six as a dep:
https://github.com/jcushman/pdfquery/blob/master/requirement...