by aperrin 1931 days ago
The program runs on Python and Tesseract. It is quite fast (less than one second for a table of 100 numbers), though I never tested it with larger tables. It detects numbers in an image of a table, which is assumed to be unrotated and cropped so that only the table is visible. So, to process multiple tables per image, one needs to create a separate image for each table. The program is rather simple, I must say. ;-)
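
To give a rough idea of the Python + Tesseract approach, here is a minimal sketch, not the actual program. It assumes the pytesseract and Pillow packages are installed, and that one line of OCR output corresponds to one table row. The names `parse_table_text` and `read_table` are made up for this illustration.

```python
import re

# Matches optionally signed integers or decimals, e.g. "12", "-3.5".
NUMBER_RE = re.compile(r"-?\d+(?:\.\d+)?")


def parse_table_text(text):
    """Turn OCR output (assumed: one table row per line) into rows of floats."""
    rows = []
    for line in text.splitlines():
        numbers = [float(tok) for tok in NUMBER_RE.findall(line)]
        if numbers:  # skip blank or non-numeric lines
            rows.append(numbers)
    return rows


def read_table(image_path):
    """OCR a cropped, unrotated table image and return its numbers."""
    # Imported here so parse_table_text can be used without OCR installed.
    # Assumption: pytesseract and Pillow are available.
    from PIL import Image
    import pytesseract

    # --psm 6 tells Tesseract to treat the image as one uniform block of text;
    # the whitelist restricts recognition to digits, '.', and '-'.
    config = "--psm 6 -c tessedit_char_whitelist=0123456789.-"
    text = pytesseract.image_to_string(Image.open(image_path), config=config)
    return parse_table_text(text)
```

Calling `read_table("table.png")` (a hypothetical file name) on a cropped image of a single table would return a list of rows, each a list of floats. For an image with several tables, you would crop each table to its own file and call it once per file.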

As for handwriting, I think Tesseract can handle the recognition if the writing is neat, but the table still needs to satisfy the assumptions above. Also, the pre-processing can't remove much noise, so that can be a problem too!