by NewDimension
2372 days ago
I'd prefer something I can install locally (it doesn't need to be open source). I'm trying to extract text from a PDF at a certain position. The PDF is real text, not an image, so OCR isn't strictly needed. The goal is to draw a box using a GUI, then use those coordinates to extract text from several homogeneous pages. I also have a separate goal: interpreting the structure of a PDF that has visual structure (headers, sections, and subsections, all numbered). But that seems to lend itself to some sort of text parsing.
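
A minimal sketch of the box-extraction idea, assuming some PDF library has already produced per-word bounding boxes for each page. The `Word` tuple, the `text_in_box` helper, the sample coordinates, and the top-left origin are all hypothetical and only illustrate reusing one box across pages with the same layout:

```python
from typing import NamedTuple


class Word(NamedTuple):
    # Hypothetical layout: corners in page units, origin assumed at top-left.
    x0: float
    y0: float
    x1: float
    y1: float
    text: str


def text_in_box(words, box):
    """Join the words that lie fully inside box = (x0, y0, x1, y1)."""
    bx0, by0, bx1, by1 = box
    inside = [
        w for w in words
        if w.x0 >= bx0 and w.y0 >= by0 and w.x1 <= bx1 and w.y1 <= by1
    ]
    # Rough reading order: top to bottom, then left to right.
    inside.sort(key=lambda w: (round(w.y0), w.x0))
    return " ".join(w.text for w in inside)


# Two made-up pages with the same layout; the same box is reused on each.
pages = [
    [Word(50, 100, 90, 112, "Invoice"), Word(95, 100, 130, 112, "1001"),
     Word(50, 300, 80, 312, "Total")],
    [Word(50, 100, 90, 112, "Invoice"), Word(95, 100, 130, 112, "1002"),
     Word(50, 300, 80, 312, "Total")],
]
box = (40, 90, 200, 120)  # would come from the rectangle drawn in a GUI
results = [text_in_box(page, box) for page in pages]
print(results)
```

Requiring words to sit fully inside the box is a deliberate simplification; a real tool might accept partial overlap instead.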
Some reading here: https://stackoverflow.com/questions/53219016/detecting-secti...
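
For the numbered-heading goal, a rough sketch of the text-parsing route, assuming the PDF text has already been extracted into plain lines. The `HEADING` pattern and the `outline` helper are illustrative assumptions, not taken from the linked question, and the pattern will also match body lines that happen to start with a number:

```python
import re

# A dotted number such as "1", "1.1" or "2.", followed by a title.
HEADING = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+(\S.*)$")


def outline(lines):
    """Return (depth, number, title) for each line that looks like a heading."""
    result = []
    for line in lines:
        match = HEADING.match(line.strip())
        if match:
            number, title = match.groups()
            # Depth is the count of dotted components: "1.1.1" has depth 3.
            result.append((number.count(".") + 1, number, title))
    return result


sample = [
    "1 Introduction",
    "Some body text.",
    "1.1 Scope",
    "1.1.1 Out of scope",
    "2. Methods",
]
headings = outline(sample)
for depth, number, title in headings:
    print("  " * (depth - 1) + number + " " + title)
```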