


	by sandreas 2216 days ago
	My approach (in Java) was to clean up the image with a set of BoofCV filters, then run tess4j OCR to make the document searchable, and finally use Apache PDFBox to create a PDF with an invisible text layer. It's not open source yet (I plan to release it), but you could take a look at https://github.com/ctodobom/OpenNoteScanner, which seems to be much more advanced.
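
	To give a rough idea of the cleanup stage, here is a minimal sketch. It is not the actual code: it uses only the JDK (java.awt) instead of BoofCV, and the choice of filter (grayscale conversion followed by Otsu thresholding) is an assumption, since the comment above does not say which filters are used. The class and method names (ScanCleanup, binarize, otsuThreshold) are made up for illustration.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper: turns a scanned page into a black-and-white image for OCR. */
public class ScanCleanup {

    /** Luminance of a packed RGB pixel, in the range 0..255. */
    static int gray(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (299 * r + 587 * g + 114 * b) / 1000;
    }

    /** Otsu's method: pick the threshold that maximizes between-class variance. */
    static int otsuThreshold(int[] hist, int total) {
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }
        long sumBg = 0;      // sum of gray values at or below the candidate threshold
        int weightBg = 0;    // number of pixels at or below the candidate threshold
        double bestVar = -1;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightBg += hist[t];
            if (weightBg == 0) continue;
            int weightFg = total - weightBg;
            if (weightFg == 0) break;
            sumBg += (long) t * hist[t];
            double meanBg = (double) sumBg / weightBg;
            double meanFg = (double) (sumAll - sumBg) / weightFg;
            double diff = meanBg - meanFg;
            double var = (double) weightBg * weightFg * diff * diff;
            if (var > bestVar) {
                bestVar = var;
                best = t;
            }
        }
        return best;
    }

    /** Pixels at or below the Otsu threshold become black, all others white. */
    public static BufferedImage binarize(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        int[] hist = new int[256];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                hist[gray(src.getRGB(x, y))]++;
            }
        }
        int threshold = otsuThreshold(hist, w * h);
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                boolean dark = gray(src.getRGB(x, y)) <= threshold;
                out.setRGB(x, y, dark ? 0xFF000000 : 0xFFFFFFFF);
            }
        }
        return out;
    }
}
```

	The resulting BufferedImage could then be handed to tess4j (its Tesseract class has a doOCR overload that accepts a BufferedImage), and the recognized text written into the PDF with PDFBox. Those two steps are left out here because they need the external libraries and a Tesseract language data directory.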