
> 1. No one, at least not OP, ever said it's a inherent flaw of JBIG2. The fact it's an implementation error on XeroX's end is a good technical detail to know, but it is irrelevant to the topic.

It is relevant only when you assume that lossy compression has no way to control or even know of such critical changes. In reality most lossy compression algorithms use rate-distortion optimization, which is only possible when you have some measure of "distortion" in the first place. Given that the error rarely occurred at higher DPIs, its cause must have been either a miscalculation of distortion or a misconfiguration of the distortion thresholds for patching.
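
To make the rate-distortion point concrete, here is a minimal sketch of how an encoder might choose between options by minimizing `D + lambda * R`, with an optional hard cap on distortion. Everything here (`Candidate`, `pick_candidate`, the numbers) is hypothetical and for illustration only; it is not taken from any real JBIG2 encoder, and certainly not from Xerox's.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    rate_bits: float    # assumed cost of encoding this choice
    distortion: float   # assumed error measured against the original


def pick_candidate(candidates, lam, max_distortion=None):
    """Return the candidate minimizing D + lam * R.

    If max_distortion is given, candidates above that cap are rejected
    before the cost comparison.
    """
    allowed = [
        c for c in candidates
        if max_distortion is None or c.distortion <= max_distortion
    ]
    if not allowed:
        raise ValueError("no candidate satisfies the distortion cap")
    return min(allowed, key=lambda c: c.distortion + lam * c.rate_bits)


# Made-up numbers: reusing a similar glyph is cheap but not exact.
candidates = [
    Candidate("encode glyph verbatim", rate_bits=400.0, distortion=0.0),
    Candidate("reuse similar glyph", rate_bits=12.0, distortion=30.0),
]

print(pick_candidate(candidates, lam=0.1).name)
print(pick_candidate(candidates, lam=0.1, max_distortion=10.0).name)
```

With these numbers the first call picks the cheap substitution and the second, with the cap in place, falls back to the exact encoding. The point is only that the encoder has a distortion figure to act on; whether it is computed and thresholded sensibly is an implementation matter.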

In any case, a correct implementation should be able to do the correct thing. It would have been much more problematic if similar cases had been repeated, since that would mean it is much harder to write a correct implementation than expected, but that didn't happen.

> Majority of traditional compressions would make text unreadable when compression is too high or the source material is too low-resolution. They don't substitute one number for another in an "unambiguous" way (i.e. it clearly shows a wrong number instead of just a blurry blob that could be both).

Traditional compression schemes simply didn't have enough computational power to do so. The "blurry blob" is by definition something with only lower-frequency components, and there are only a small number of them, so they were easier to preserve even with limited resources. But if you have and can recognize a similar enough pattern, it should be exploited for further compression. Motion compensation in video codecs was already doing a similar thing, and either filtering or intelligent quantization that preserves higher-frequency components would be able to do so too.
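
As a rough illustration of exploiting "similar enough" patterns, the toy sketch below builds a glyph dictionary and reuses an entry whenever a new glyph is within a pixel-difference threshold of it. The 3x5 bitmaps, the Hamming-distance metric, and the function names are all my own assumptions for the example; real encoders use larger glyphs and more careful matching.

```python
def hamming(a, b):
    """Count differing pixels between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode_symbols(glyphs, threshold):
    """Map each glyph to a dictionary index.

    An existing entry is reused when it differs from the glyph by at
    most `threshold` pixels; otherwise the glyph becomes a new entry.
    """
    dictionary, indices = [], []
    for g in glyphs:
        for i, ref in enumerate(dictionary):
            if hamming(g, ref) <= threshold:
                indices.append(i)
                break
        else:
            dictionary.append(g)
            indices.append(len(dictionary) - 1)
    return dictionary, indices


# Toy 3x5 bitmaps; "6" and "8" differ by a single pixel at this size.
SIX = ("111", "100", "111", "101", "111")
EIGHT = ("111", "101", "111", "101", "111")
ONE = ("010", "110", "010", "010", "111")

page = [SIX, EIGHT, SIX, ONE]
_, exact = encode_symbols(page, threshold=0)
_, loose = encode_symbols(page, threshold=1)
print(exact)  # [0, 1, 0, 2]
print(loose)  # [0, 0, 0, 1]
```

With `threshold=0` the 6 and the 8 stay distinct. With `threshold=1` the 8 is silently replaced by the 6, which is the kind of clean-looking but wrong substitution under discussion. At a higher resolution the same two digits would differ by many more pixels, so the same threshold would no longer merge them.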

----

> 2. "Lower DPI" is extremely common if your definition for that is 300dpi. At my company, all the text document are scanned at 200dpi by default. And 150dpi or even lower is perfectly readable if you don't use ridiculous compression ratios.

I admit I generalized too much, but the choice of scan resolution is highly specific to content, font sizes, and even writing systems. If you and your company can cope with lower DPIs, that's good for you, but I still believe 300 dpi is the safe minimum.