by aidenn0
680 days ago
Huh, I tried with the version from pip (instead of the one from my package manager) and it completes in 22s. Output on the only page I tested is considerably worse than Tesseract's, particularly with punctuation. Paragraph detection did not seem to work at all: the entire page was rendered on a single line. Even worse for my uses, Tesseract made two mistakes on this page (part of why I picked it), and EasyOCR did not read either of those spots correctly.

Partial list of EasyOCR's mistakes:

1. Missed several full-stops at the end of sentences
2. Rendered two full-stops as colons
3. Rendered two commas as semicolons
4. Misrendered every single em-dash in various ways (e.g. "_~")
5. Missed 4 double-quotes
6. Missed 3 apostrophes, including rendering "I'll" as "Il"
7. Rendered all 5 exclamation points as a lowercase-ell ("l"). Tesseract got 4 correct and missed one.
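For anyone wanting to repeat this kind of comparison without counting by hand, here is a rough sketch of how the punctuation errors could be tallied against a hand-corrected transcript. The function names and the set of tracked marks are my own assumptions, not part of EasyOCR or Tesseract, and the OCR output is assumed to already be in hand as plain strings. It only compares counts and does not align positions, so a full-stop read as a colon shows up as one missing "." and one extra ":".

```python
from collections import Counter

# Marks to track; "\u2014" is the em-dash. This set is an assumption, extend as needed.
PUNCTUATION = set(".,:;!?'\"\u2014")


def punctuation_counts(text):
    """Count how often each tracked punctuation mark occurs in text."""
    return Counter(ch for ch in text if ch in PUNCTUATION)


def punctuation_diff(reference, ocr_output):
    """Per mark: reference count minus OCR count.

    Positive means the OCR output is missing that mark,
    negative means it produced extra copies. Marks with equal
    counts are left out.
    """
    ref = punctuation_counts(reference)
    out = punctuation_counts(ocr_output)
    marks = set(ref) | set(out)
    return {m: ref[m] - out[m] for m in marks if ref[m] != out[m]}


if __name__ == "__main__":
    # Made-up strings for illustration, not the page from the test above.
    reference = "I'll go. Wait!"
    ocr_output = "Il go: Waitl"
    for mark, delta in sorted(punctuation_diff(reference, ocr_output).items()):
        print(repr(mark), delta)
```

Running the same function once per engine against the same reference gives a quick side-by-side of which marks each one drops or invents.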