Hacker News
by jcuenod 1241 days ago
What are the chances? I've just started the live beta of https://fixpdfs.com, which converts PDFs of scanned documents and books into better "documents" with OCR, normalized margins, etc. (for better reading, searching, and highlighting).
1 comments

In my experience, OCRing scanned PDFs results in many small errors. For example, “Alt” could be interpreted as “A|t”. Did you have those problems? How did you fix them? What about other languages?
I didn't build my own OCR models. In the beta I'm using Tesseract, but I'm going to switch to Google or Amazon when I start charging. There's no way to compete on OCR quality, but I don't see other products automatically fixing doc scans, which is the value add I see my software really giving...
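
For readers wondering what a fix for the “A|t” kind of error could look like: below is a minimal post-processing sketch, assuming the OCR output is already available as plain text. It is not something fixpdfs.com or Tesseract is stated to do in this thread. The helper name `fix_pipe_confusion` is hypothetical, and the rule (a pipe squeezed between two letters was probably a lowercase "l") is only a heuristic that was chosen for illustration.

```python
import re

# Hypothetical helper, not part of Tesseract or fixpdfs.com.
# Assumption: a "|" with a letter directly on each side is a misread "l".
_PIPE_IN_WORD = re.compile(r"(?<=[A-Za-z])\|(?=[A-Za-z])")


def fix_pipe_confusion(text: str) -> str:
    """Replace a pipe that sits between two ASCII letters with 'l'."""
    return _PIPE_IN_WORD.sub("l", text)


if __name__ == "__main__":
    print(fix_pipe_confusion("Press A|t to open the menu"))
    # A pipe surrounded by spaces is left alone.
    print(fix_pipe_confusion("a | b"))
```

A rule like this will also rewrite legitimate text such as `x|y`, and it only covers ASCII letters, so it says nothing about the other-languages question. A dictionary check on the corrected word would be a natural next step.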