| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by 205guy 4701 days ago

This is an interesting story with lots of odd comments.

First of foremost, I agree that Xerox putting their name on a product which creates an unfaithful copy is corporate suicide. Such an ancient paragon of computer innovation should be able to come up with a clever algorithm that compresses but doesn't substitute image bits.

But...

- The original story[1] didn't mention that the product itself warns against the very thing they are reporting. Did they ignore that warning, did the copier not show it, did they use a setting that did not have the warning? Their further posts cover the issue, so it looks like somebody else set the resolution and ignored the warning.

- Calling what the JBIG2 algorithm does "OCR" is misleading. OCR is pretty much understood to be analog text (image) to digital text (ASCII, UTF-32). Matching to a real character set and outputting those characters is a defining part of true OCR. It's also confusing because the copiers have a true OCR function, and this is not related. What JBIG2 does, I would call it "sub-image matching and substitution."

- Calling JBIG2 "lossy" is also misleading. I suppose it is lossy by definition, but lossy is usually limited to pixel effects as seen in JPG, no image blocks.

- JBIG2 seems like an algorithm that shouldn't be used on low-res text documents. You might say it's just a configuration of the algorithm, but if engineers can't take it as a tool and use it correctly, you start to wonder if it's a problem with the tool.

[1] http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...?