by faustomorales 2362 days ago

Hi HN! I made this because I wanted a toolkit for training custom OCR models that included both text detection and recognition, along with the tools needed to create synthetic data. Existing synthetic data generators had more dependencies and setup than I felt was necessary, so I took a different tack and limited the dependencies to PIL only.

Some use cases for this package:

- You can use the pretrained models (trained by others!) for OCR on English text (see the README for an example). [0]

- You can fine-tune a version of the detection and recognition models on a different alphabet / language (see the tutorial [1]).

- You can just use the data generator with backgrounds and fonts (I provide a packaged set of both) to create images with character-level annotations for some other model [2].
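To make the third use case concrete, here is a minimal sketch of what "images with character-level annotations" can look like using only PIL. This is not the keras-ocr generator or its API: `render_with_char_boxes`, the fixed `CELL_W`/`CELL_H` cell size, and the use of Pillow's default font are all assumptions made for illustration.

```python
from PIL import Image, ImageDraw

# Hypothetical fixed cell size per character; a real generator would use font metrics.
CELL_W, CELL_H = 16, 20


def render_with_char_boxes(text, margin=4):
    """Render `text` and return (image, annotations).

    Each annotation is (char, (left, top, right, bottom)) in image pixels.
    Characters with no visible ink (e.g. spaces) take up a cell but get no annotation.
    """
    width = 2 * margin + CELL_W * len(text)
    height = 2 * margin + CELL_H
    canvas = Image.new("L", (width, height), 0)  # black background
    annotations = []
    for i, ch in enumerate(text):
        # Draw the glyph alone on a scratch cell so its ink bounds can be measured.
        cell = Image.new("L", (CELL_W, CELL_H), 0)
        ImageDraw.Draw(cell).text((1, 1), ch, fill=255)  # Pillow's default font
        ink = cell.getbbox()  # None when the cell is entirely black
        x0, y0 = margin + i * CELL_W, margin
        canvas.paste(cell, (x0, y0))
        if ink is not None:
            left, top, right, bottom = ink
            annotations.append((ch, (x0 + left, y0 + top, x0 + right, y0 + bottom)))
    return canvas, annotations


image, boxes = render_with_char_boxes("OCR 42")
for ch, box in boxes:
    print(ch, box)
```

Measuring each glyph on its own scratch cell keeps the boxes tight without relying on font-metric calls, at the cost of fixed-width spacing. Swapping in real fonts and background images is where an actual generator would differ.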

I'd really like to continue improving the image generator to render more realistic images while retaining the existing mix of simplicity / flexibility. Ideas welcome!

[0] https://keras-ocr.readthedocs.io/en/latest/examples/using_pr...

[1] https://keras-ocr.readthedocs.io/en/latest/examples/end_to_e...

[2] https://keras-ocr.readthedocs.io/en/latest/examples/end_to_e...

1 comment

sansnomme 2362 days ago

You should market this more. There is a severe need for a Tesseract replacement: Tesseract has been falling behind the current state of the art, especially compared to many cloud offerings.


faustomorales 2357 days ago

Tesseract is a great solution for scanning books -- but I agree that it doesn't work very well for most other use cases, especially when compared to cloud providers. FWIW, I have started trying to compare keras-ocr against the cloud options. [1]

[1] https://github.com/faustomorales/keras-ocr#comparing-keras-o...
