by kkielhofner
807 days ago
FWIW PyMuPDF doesn't do OCR itself. It extracts the text embedded in a PDF, which in some cases is either nonexistent or the product of poor-quality OCR (e.g. whatever implementation shipped with the scanner). This implementation bolts on Tesseract, which IME is typically not the best OCR available.
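To make the embedded-text point concrete, here is a rough sketch of how one might flag pages whose text layer is missing or nearly empty. The helper names (`needs_ocr`, `pages_needing_ocr`) and the 20-character threshold are my own assumptions for illustration, not anything from the project under discussion, and a short text layer is only a crude hint that a page is a bare scan.

```python
def needs_ocr(page_text: str, min_chars: int = 20) -> bool:
    """Heuristic (assumed threshold): a page with almost no embedded
    text probably has no usable text layer and would need OCR."""
    return len(page_text.strip()) < min_chars


def pages_needing_ocr(pdf_path: str, min_chars: int = 20) -> list[int]:
    """Return 0-based page numbers whose embedded text looks empty."""
    import fitz  # PyMuPDF; imported lazily so needs_ocr works without it

    with fitz.open(pdf_path) as doc:
        # get_text() returns only text already embedded in the PDF;
        # it does not recognize text in scanned images.
        return [
            page.number
            for page in doc
            if needs_ocr(page.get_text(), min_chars)
        ]
```

Note this only catches a missing text layer. A page with a low-quality OCR layer from the scanner would still pass the check, so it says nothing about how good the embedded text is.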