Hacker News
by
ce4
1302 days ago
I use
https://kebekus.gitlab.io/scantools
for scanning. It builds on top of Tesseract and works great for PDF enhancement.
rjzzleep
1302 days ago
You might be interested in
https://github.com/ocrmypdf/OCRmyPDF
then.
It does quite a bit of preprocessing on the PDF pages before passing them on to Tesseract.
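For anyone who wants to try it, here is a rough sketch of turning that preprocessing on from a script. The helper name build_ocrmypdf_command and the file names are made up for illustration; --deskew, --rotate-pages, and --clean are OCRmyPDF command-line options, and --clean additionally needs unpaper installed. Which options actually help depends on your scans.

```python
import subprocess
import sys


def build_ocrmypdf_command(input_pdf, output_pdf,
                           deskew=True, rotate_pages=True, clean=False):
    """Assemble an ocrmypdf command line.

    Hypothetical helper for illustration; it is not part of OCRmyPDF itself.
    """
    cmd = ["ocrmypdf"]
    if deskew:
        cmd.append("--deskew")        # straighten slightly crooked scans
    if rotate_pages:
        cmd.append("--rotate-pages")  # fix pages scanned sideways or upside down
    if clean:
        cmd.append("--clean")         # clean pages before OCR (requires unpaper)
    cmd += [input_pdf, output_pdf]
    return cmd


if __name__ == "__main__" and len(sys.argv) == 3:
    # Assumes the ocrmypdf executable is installed and on PATH.
    subprocess.run(build_ocrmypdf_command(sys.argv[1], sys.argv[2]), check=True)
```

Saved as, say, ocr_scan.py, this would be run as: python ocr_scan.py scan.pdf scan_ocr.pdf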
angrygoat
1302 days ago
I've found ocrmypdf to be excellent: the only issue I've had is with PDFs with differing page sizes; it seems to scale everything up to the size of the largest page, which can be a bit of a pain.