by mpeg
878 days ago
Same here, fitz is great. It does well enough out of the box that I can apply some simple heuristics for things like joining/splitting paragraphs where it makes a mistake, extract drawings and such, and get pretty close to 100% accuracy on the output. The only thing it doesn't do is table detection (neither does pdfminer.six), but there are plenty of other ways to handle tables.
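For anyone wondering what a "simple heuristic" for joining paragraphs might look like, here is a rough sketch, not the exact approach described above. The function names `merge_blocks` and `page_paragraphs` and the rules themselves (join when the previous block lacks closing punctuation and the next one starts lowercase, and undo end-of-block hyphenation) are my own assumptions for illustration. The only PyMuPDF calls used are `fitz.open` and `page.get_text("blocks")`, which returns tuples whose fifth item is the text and whose seventh is the block type (0 for text).

```python
# Characters that usually mark the end of a finished paragraph (an assumption).
_TERMINAL = (".", "!", "?", ":", '"')


def merge_blocks(blocks):
    """Join text blocks that look like one paragraph split in two."""
    paragraphs = []
    for raw in blocks:
        # Collapse internal newlines and runs of whitespace into single spaces.
        text = " ".join(raw.split())
        if not text:
            continue
        if paragraphs:
            prev = paragraphs[-1]
            starts_lower = text[0].islower()
            if prev.endswith("-") and starts_lower:
                # Word hyphenated across the split: drop the hyphen and glue.
                paragraphs[-1] = prev[:-1] + text
                continue
            if not prev.endswith(_TERMINAL) and starts_lower:
                # Previous block stopped mid-sentence: continue it.
                paragraphs[-1] = prev + " " + text
                continue
        paragraphs.append(text)
    return paragraphs


def page_paragraphs(pdf_path):
    """Extract text blocks with PyMuPDF and merge them page by page."""
    import fitz  # PyMuPDF, imported lazily so merge_blocks works without it

    paragraphs = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # Each block is (x0, y0, x1, y1, text, block_no, block_type).
            texts = [b[4] for b in page.get_text("blocks") if b[6] == 0]
            paragraphs.extend(merge_blocks(texts))
    return paragraphs
```

This merges within a page only, so a paragraph that runs across a page break stays split, and a legitimately hyphenated compound at the end of a block would lose its hyphen. Rules like these usually need tuning per document set.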