| HN Mirror


	by jakobegger 2924 days ago
	Yes, and "certificates" sounds like "certiticates". Reminds me of a story about a copying machine that had an image compression algorithm for scans that changed some numbers on the scanned page to make the compressed image smaller. (Can't remember where I read about that, must have been a couple of years ago on HN)

2 comments

raphlinus 2924 days ago

It's the lossy jbig2 compression in Xerox copiers: http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...

And yes, I think this is a relevant comparison. As the entropy model becomes more sophisticated, errors are more likely to be plausible texts with different meaning, and less likely to be degraded in ways that human processing can intuitively detect and compensate for.

misnome 2924 days ago

> It's the lossy jbig2 compression in Xerox copiers: http://www.dkriesel.com/en/blog/2013/0802_xerox-workcentres_...

My understanding of this fault was that it was a bug in their implementation of JBIG2, not the actual compression? Linked article seems to support this.

raphlinus 2924 days ago

I think it was just overly aggressive settings of compression parameters. I don't see any evidence that the jbig2 compressor was implemented incorrectly. Source: [1]

[1]: https://www.xerox.com/assets/pdf/ScanningQAincludingAppendix...

jay-anderson 2923 days ago

Right. Jbig2 supports lossless compression. I'm not very familiar with the bug, but it could have been a setting somewhere in the scanner/copier that was changed to lossy compression. Or they had lossy compression on by default, or it was misconfigured some other way (lossy is probably a bad idea for text documents).

namibj 2923 days ago

The bad thing was that it used lossy compression when copying. That was the problem.

gsich 2923 days ago

No. The bug occurred when using the "Scan to PDF" function. It happened on all quality settings. Copying (scanning+printing in one step, no PDF) was not affected.

Dylan16807 2923 days ago

No compression system in the world forces you to share parts of the image that shouldn't be shared. So that's true in a vacuous sense.

But the nature of the algorithm means that you have this danger by default. So it's fair to put some blame there.

c3534l 2923 days ago

This is a big rabbit hole of issues I'd never even considered before. Should we be striving to hide our mistakes by making our best guess, or should we make a guess that, if wrong, is easy to detect?

tinus_hn 2923 days ago

The algorithm detected similar patterns and replaced these with references. This led to characters being changed into similar-looking characters that also appeared on the page.
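
To make that mechanism concrete, here is a toy sketch of pattern matching with substitution. It is not JBIG2 and not Xerox's code; the glyph bitmaps, the Hamming-distance test, and the names `encode`, `decode`, and `threshold` are all assumptions made up for illustration. It only shows how a "similar enough" match can silently turn one digit into another.

```python
from typing import List, Tuple

# A glyph is a tiny bitmap: a tuple of rows made of '#' (ink) and '.' (blank).
Glyph = Tuple[str, ...]


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixels that differ between two same-sized glyphs."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode(glyphs: List[Glyph], threshold: int) -> Tuple[List[Glyph], List[int]]:
    """Store each distinct pattern once and emit references to it.

    A glyph is replaced by a reference to the first dictionary entry that
    differs in at most `threshold` pixels (hypothetical matching rule).
    """
    dictionary: List[Glyph] = []
    refs: List[int] = []
    for g in glyphs:
        for i, d in enumerate(dictionary):
            if hamming(g, d) <= threshold:
                refs.append(i)  # reuse an existing, merely similar pattern
                break
        else:
            dictionary.append(g)  # no match: store a new pattern
            refs.append(len(dictionary) - 1)
    return dictionary, refs


def decode(dictionary: List[Glyph], refs: List[int]) -> List[Glyph]:
    """Rebuild the page by looking up every reference."""
    return [dictionary[i] for i in refs]


# Two made-up 4x5 digits that differ in exactly one pixel.
EIGHT: Glyph = (".##.", "#..#", ".##.", "#..#", ".##.")
SIX: Glyph = (".##.", "#...", ".##.", "#..#", ".##.")

if __name__ == "__main__":
    page = [EIGHT, SIX, EIGHT]
    for threshold in (0, 1):
        dictionary, refs = encode(page, threshold)
        intact = decode(dictionary, refs) == page
        print(f"threshold={threshold} symbols={len(dictionary)} intact={intact}")
```

With `threshold=0` every distinct glyph gets its own dictionary entry and the page round-trips exactly. With `threshold=1` the six matches the stored eight, so the decoded page contains a crisp, perfectly legible, wrong digit.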

eboyjr 2924 days ago

Xerox copier flaw changes numbers in scanned docs: https://www.theregister.co.uk/2013/08/06/xerox_copier_flaw_m...
