| HN Mirror



	by moonchild 740 days ago
	they are talking about treating ocr as lossy. i wonder about making a lossless compression algorithm for text scans based on an ocr; in effect, use the ocr to predict which text will show up and how, and then encode the pixel-level differences on top of that
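	A rough sketch of that idea, to make it concrete: render what the OCR output predicts, XOR it against the real scan, and compress only the residual. Everything here is an assumption for illustration. `TEMPLATES` is a made-up 3x3 font standing in for a real OCR engine's glyph model, each row is packed into one byte, and zlib stands in for whatever entropy coder one would actually use.

```python
import zlib

# Hypothetical 3x3 glyph templates (one byte per row); a stand-in for
# whatever rendering model a real OCR engine would provide.
TEMPLATES = {
    "I": [0b010, 0b010, 0b010],
    "L": [0b100, 0b100, 0b111],
}

def render(text):
    """Predicted scan: the template rows of each recognised character, in order."""
    return bytes(row for ch in text for row in TEMPLATES[ch])

def encode(scan, text):
    """Keep the OCR text plus the compressed XOR residual against the prediction."""
    predicted = render(text)
    if len(predicted) != len(scan):
        raise ValueError("prediction and scan must have the same size")
    residual = bytes(a ^ b for a, b in zip(scan, predicted))
    return text, zlib.compress(residual)

def decode(text, packed):
    """Re-render the prediction and XOR the residual back in; this is lossless."""
    predicted = render(text)
    residual = zlib.decompress(packed)
    return bytes(a ^ b for a, b in zip(predicted, residual))

# A "scan" of the text IL with one flipped pixel of noise.
noisy = bytearray(render("IL"))
noisy[1] ^= 0b001
text, packed = encode(bytes(noisy), "IL")
restored = decode(text, packed)
```

	When the prediction is good, the residual is mostly zeros and compresses well; when the OCR is wrong, the residual just gets bigger, but the round trip stays exact.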

1 comment

DjVu does this to some extent, identifying identical glyph bitmaps and reusing them for compression. See https://en.m.wikipedia.org/wiki/DjVu#Compression
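
The glyph-reuse idea can be sketched as a small shape dictionary: store each distinct bitmap once and describe the page as a list of indices into it. This is only an illustration of exact-match deduplication, not DjVu's actual codec; the function names and the bytes-per-glyph representation are assumptions.

```python
def dedupe_glyphs(glyphs):
    """Return (dictionary, refs): unique bitmaps in first-seen order, plus one index per glyph."""
    dictionary = []
    index_of = {}
    refs = []
    for bitmap in glyphs:
        if bitmap not in index_of:
            index_of[bitmap] = len(dictionary)
            dictionary.append(bitmap)
        refs.append(index_of[bitmap])
    return dictionary, refs

def rebuild(dictionary, refs):
    """Expand the index list back into the original glyph sequence."""
    return [dictionary[i] for i in refs]

# Three glyphs on the page, two of them pixel-identical.
page = [b"\x02\x02\x02", b"\x04\x04\x07", b"\x02\x02\x02"]
shapes, refs = dedupe_glyphs(page)
```

Exact matching only pays off for clean, synthetic-looking text; on noisy scans near-identical glyphs would each get their own entry unless a tolerance or a residual is added.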