HN Mirror

	by Dylan16807 81 days ago
	JBIG2 pulls its reference chunks dynamically out of the image itself, which makes it more likely that two different target shapes end up with insufficient separation between them. It also gives a false sense of security: it displays dirty pixels that still clearly show a specific digit, so you think you're basically looking at the original.
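
To make the failure mode in the comment above concrete, here is a toy sketch. It is not the real JBIG2 codec: the 3x5 bitmaps, the `max_diff` threshold, and the `encode`/`decode` helpers are all made up for illustration. The only idea carried over is that the symbol dictionary is built from glyphs found on the page itself, and that a loose match threshold lets one glyph stand in for another.

```python
# Toy illustration only: hypothetical helpers, not the JBIG2 bitstream or any real library API.

def hamming(a, b):
    """Count differing pixels between two equally sized bitmaps (tuples of strings)."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def encode(glyphs, max_diff):
    """Build a symbol dictionary from the page itself and map each glyph to an entry.

    A glyph reuses the first dictionary entry within max_diff pixels of it;
    otherwise it becomes a new entry. max_diff is an assumed tuning knob.
    """
    dictionary = []
    indices = []
    for glyph in glyphs:
        for i, ref in enumerate(dictionary):
            if hamming(glyph, ref) <= max_diff:
                indices.append(i)  # substitute the earlier shape
                break
        else:
            dictionary.append(glyph)
            indices.append(len(dictionary) - 1)
    return dictionary, indices

def decode(dictionary, indices):
    """Rebuild the page by stamping dictionary entries back in order."""
    return [dictionary[i] for i in indices]

SIX = ("###", "#..", "###", "#.#", "###")
EIGHT = ("###", "#.#", "###", "#.#", "###")
ONE = (".#.", "##.", ".#.", ".#.", "###")

page = [SIX, ONE, EIGHT]

strict = decode(*encode(page, max_diff=0))  # every distinct shape kept
loose = decode(*encode(page, max_diff=1))   # the 8 is one pixel away from the 6

print(strict == page)    # True
print(loose[2] == SIX)   # True: the 8 comes back as a clean-looking 6
```

With the loose threshold, the decoded page shows a crisp 6 where the original had an 8, and nothing in the output hints that a substitution happened.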

thaumasiotes 81 days ago

That's a description of JBIG2, not a description of OCR.

JBIG2 is an OCR algorithm that doesn't assume the document comes from a pre-existing alphabet.

Dylan16807 81 days ago

You asked what the difference was, and I said the difference. Was it unclear that, to fit the phrasing of your question, we add "OCR doesn't"? I would not personally call JBIG2 OCR.

thaumasiotes 80 days ago

> You asked what the difference was, and I said the difference.

Take another look at my comment.

Dylan16807 80 days ago

Let me try rephrasing to make the response to your original comment as clear as possible.

Question: "How can we describe OCR that wouldn't match this definition exactly?"

Answer: This definition largely fits OCR, but "reference to a single instance" is a weird way to phrase it. A better definition of OCR would include how it uses built-in knowledge of glyphs and text structure, unlike JBIG2, which looks for examples dynamically. And that difference in technique gives you a significant difference in the end results.

Is that better?

The definition you quoted is not an "exact" fit to OCR; it's a mildly misleading fit to OCR, and clearing up the misleading part makes it no longer fit both OCR and JBIG2.
