Hacker News new | ask | show | jobs
by amelius 3596 days ago
The next step would be to "depixelate" the resulting image. How could this be done? I guess OCR would not work because of the variation of the fonts (you don't want the document to end up in a single font; you want to keep the fonts). Could a deep learning approach work here, even if it has not been trained on all the specific fonts?
1 comments

Plenty of engines will do OCR and use the shapes recognised with high certainty to affect how they detect the rest.

There are many ways of doing this, and you can achieve some results even without knowing if your image is text, but just has lots of self-similarity by virtually sliding a "grid" over the image, slicing it up into n-by-n squares, running any of a number of nearest-neighbour variants over it, and then for each cluster replace all instances of the squares in the cluster by the one which minimise the overall error rate vs the others.

This will work reasonably well for very structured images such as text, as long as enough characters are near correct, and will retain custom fonts etc. but clean them up quite a bit as long as they either are different enough, or occur often enough on a page to not get "corrected".

I'm sure there are better ways of doing this too - it's been a decade since I kept up with OCR research.