

	by spencerflem 578 days ago
	That's how all compressors work, in that likely files (eg. ASCII, obvious patterns, etc) become smaller and unlikely files become bigger.

3 comments

Dylan16807 578 days ago

> likely files (eg. ASCII, obvious patterns, etc) become smaller

Likely files for a real human workload are like that, but if "most inputs" is talking about the set of all possible files, then that's a whole different ball game and "most inputs" will compress very badly.

> unlikely files become bigger

Yes, but when compressors can't find useful patterns they generally only increase size by a small fraction of a percent. There aren't files that get significantly bigger.
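For anyone who wants to check the "small fraction of a percent" claim, here is a rough sketch using Python's standard `zlib` module. The inputs are my own assumptions, not anything from the thread: 100,000 pseudo-random bytes stand in for "a file with no useful patterns", and a repeated sentence stands in for a "likely" file. Other compressors and settings will give different numbers.

```python
import random
import zlib


def size_change(data: bytes) -> float:
    """Relative size change after zlib compression (negative means smaller)."""
    compressed = zlib.compress(data, 9)
    return (len(compressed) - len(data)) / len(data)


# Assumption: seeded pseudo-random bytes as a stand-in for incompressible input.
rng = random.Random(0)
random_bytes = bytes(rng.getrandbits(8) for _ in range(100_000))

# Assumption: a highly repetitive ASCII input as a stand-in for a "likely" file.
patterned = b"the quick brown fox jumps over the lazy dog\n" * 2_000

print(f"random:    {size_change(random_bytes):+.4%}")
print(f"patterned: {size_change(patterned):+.4%}")
```

The random input should come out slightly larger (the format falls back to storing the data nearly verbatim plus a little framing), while the patterned input shrinks dramatically.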


PaulHoule 578 days ago

In some cases it can be certain: ASCII encoded in the usual 8 bits has fat to trim even if it is random within that space.
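A minimal sketch of that "fat": every ASCII byte has its top bit clear, so 8 characters fit in 7 bytes no matter how random the characters are. The function names `pack_ascii` and `unpack_ascii` are made up for illustration, and the big-integer approach is chosen for brevity, not speed.

```python
import random


def pack_ascii(data: bytes) -> bytes:
    """Pack 7-bit ASCII into a dense bit string (8 characters -> 7 bytes)."""
    if any(b > 0x7F for b in data):
        raise ValueError("input is not 7-bit ASCII")
    bits = 0
    for b in data:
        bits = (bits << 7) | b  # append 7 bits per character
    nbits = 7 * len(data)
    nbytes = (nbits + 7) // 8
    bits <<= nbytes * 8 - nbits  # left-align; zero padding goes at the end
    return bits.to_bytes(nbytes, "big")


def unpack_ascii(packed: bytes, length: int) -> bytes:
    """Inverse of pack_ascii; the original length must be supplied."""
    nbits = 7 * length
    bits = int.from_bytes(packed, "big") >> (len(packed) * 8 - nbits)
    return bytes((bits >> (7 * (length - 1 - i))) & 0x7F for i in range(length))


# Assumption: uniformly random values in 0..127 as "random ASCII".
rng = random.Random(0)
sample = bytes(rng.randrange(128) for _ in range(8_000))
packed = pack_ascii(sample)
print(len(sample), "->", len(packed))  # 8000 -> 7000
```

The saving is a fixed 12.5% and needs no pattern in the data at all, only the knowledge that the input stays within 7 bits per byte.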


Retr0id 576 days ago

Right, but the point was, the case where it became bigger was ~impossible to find.


spencerflem 575 days ago

Yeah good point, kinda glossed over that part of the original post. Don't think that that's possible fwiw.

IMO, the fun part of compression algorithms is that the set of files that become smaller is as narrow as possible while the set of files that become bigger is as big as possible, so _most_ files don't compress well! The trick is to get the set of files that get smaller to be just the useful files and nothing else.
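The "most files don't compress well" part follows from plain counting (the pigeonhole argument), which can be sketched as below. A lossless compressor has to map distinct inputs to distinct outputs, and there are simply not enough short outputs to go around. The helper name `max_fraction_shrinking` is hypothetical, and the bound applies to any lossless scheme, not to one specific compressor.

```python
from fractions import Fraction


def max_fraction_shrinking(n_bits: int, saved_bits: int) -> Fraction:
    """Upper bound on the share of n-bit inputs that any lossless
    compressor can shorten by at least saved_bits bits."""
    inputs = 2 ** n_bits
    # Number of distinct bit strings of length 0 .. n_bits - saved_bits.
    short_outputs = 2 ** (n_bits - saved_bits + 1) - 1
    return Fraction(min(short_outputs, inputs), inputs)


# Share of all 1 KiB files that can shrink by at least one byte:
bound = max_fraction_shrinking(8 * 1024, 8)
print(float(bound))
```

For 1 KiB inputs, the bound for saving even a single byte is just under 1/128, so fewer than 1% of all possible files can get one byte smaller under any lossless compressor.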
