by robertknight 1435 days ago
Thanks for this test case. When I drop that image, the individual words are recognized correctly, but from about midway through they are not displayed in the correct order in the text box at the bottom. If the image is rotated so that the text baselines are horizontal (a rotation of about 1.5 degrees), the words are displayed in the correct order. So it looks like smarter methods or defaults are needed for the layout analysis.

I think with modern methods it ought to be relatively easy to teach a system to predict the amount of rotation needed to straighten the image, or make the layout analysis tolerate minor rotations of the input better. Needs someone to actually implement it though!
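
To make the second idea concrete, here is a minimal sketch of layout analysis that tolerates a small rotation. It is not the project's actual code, and it uses a simple geometric heuristic rather than a learned model. It assumes the recognizer already gives a center point for each word, with x pointing right and y pointing down. The names `rotate`, `estimate_skew`, `reading_order`, `max_skew` and `line_gap` are all made up for illustration. The skew is taken as the median angle from each word to its nearest neighbor on the right, the centers are rotated back by that angle, and only then are the words grouped into lines.

```python
import math
from statistics import median


def rotate(point, degrees):
    """Rotate an (x, y) point about the origin by `degrees`."""
    t = math.radians(degrees)
    x, y = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))


def estimate_skew(centers, max_skew=10.0):
    """Estimate baseline skew in degrees from word center points.

    For each word, find the nearest word to its right whose direction is
    within `max_skew` degrees of horizontal, and record that direction.
    The median of those directions is the skew estimate.
    """
    angles = []
    for x0, y0 in centers:
        best = None  # (distance, angle) of the nearest qualifying neighbor
        for x1, y1 in centers:
            dx, dy = x1 - x0, y1 - y0
            if dx <= 0:
                continue  # only look at words to the right
            angle = math.degrees(math.atan2(dy, dx))
            if abs(angle) > max_skew:
                continue  # too steep to be on the same text line
            dist = math.hypot(dx, dy)
            if best is None or dist < best[0]:
                best = (dist, angle)
        if best is not None:
            angles.append(best[1])
    return median(angles) if angles else 0.0


def reading_order(words, line_gap):
    """Return word texts in reading order.

    `words` is a list of (text, (x, y)) pairs. `line_gap` is the smallest
    vertical jump, after deskewing, that starts a new line.
    """
    skew = estimate_skew([center for _, center in words])
    # Undo the estimated rotation so that baselines become horizontal.
    flat = [(text, rotate(center, -skew)) for text, center in words]
    flat.sort(key=lambda w: w[1][1])  # top to bottom

    lines, prev_y = [], None
    for text, (x, y) in flat:
        if prev_y is None or y - prev_y > line_gap:
            lines.append([])
        lines[-1].append((x, text))
        prev_y = y
    # Left to right within each line, lines from top to bottom.
    return [text for line in lines for _, text in sorted(line)]
```

For a page of about 1.5 degrees of skew like the one described above, `reading_order(words, line_gap)` with `line_gap` set to roughly half the line height should put the words back in order without rotating the image itself. The nearest-neighbor heuristic assumes words on a line sit closer together than the cone of `max_skew` reaches into adjacent lines, so it would need tuning for tightly spaced or multi-column text.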