HN Mirror



	by unsigner 2011 days ago
	Zstd is very different - it includes an entropy coder. LZ4 only finds repeated matches, but then doesn't encode them very efficiently. To put it simplistically, if you have a file which is a (good) random mix of an equal number of A and B characters, LZ4 won't be able to compress it significantly, while Zstd will compress it 8:1, converging to an encoding where a '1' bit is A and a '0' bit is B.
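The 8:1 figure follows from such a file carrying one bit of information per 8-bit byte. Here is a minimal Python sketch of that bound. It is not Zstd's actual algorithm; the helper names `pack_ab` and `unpack_ab` and the 8192-byte sample are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def pack_ab(data: bytes) -> bytes:
    """Pack a string of A/B bytes into bits: A -> 1, B -> 0."""
    out = bytearray()
    for i in range(0, len(data), 8):
        chunk = data[i:i + 8]
        byte = 0
        for ch in chunk:
            byte = (byte << 1) | (1 if ch == ord("A") else 0)
        byte <<= 8 - len(chunk)  # left-align a final partial byte
        out.append(byte)
    return bytes(out)


def unpack_ab(packed: bytes, n: int) -> bytes:
    """Inverse of pack_ab; n is the original length in bytes."""
    out = bytearray()
    for byte in packed:
        for shift in range(7, -1, -1):
            out.append(ord("A") if (byte >> shift) & 1 else ord("B"))
    return bytes(out[:n])


# Hypothetical sample: an equal number of A and B, randomly shuffled.
rng = random.Random(0)
symbols = [ord("A")] * 4096 + [ord("B")] * 4096
rng.shuffle(symbols)
data = bytes(symbols)

packed = pack_ab(data)
print(entropy_bits_per_symbol(data))  # bits of information per input byte
print(len(data) / len(packed))        # compression ratio of the bit packing
```

A real entropy coder also has to store its symbol table and headers, so it only approaches this ratio on large inputs.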


YetAnotherNick 2011 days ago

> To put it simplistically, if you have a file which is a (good) random mix of an equal number of A and B characters, LZ4 won't be able to compress it significantly

I checked it. LZ4 still reduces the size by half; no idea why half. So a 10 MB file compresses to 5 MB.

Edit: checked with the highest compression level, and it compresses a 1 MB file to 185 KB. So what the parent wrote is false.


not2b 2011 days ago

Yes, if I take the 8 combinations aaa, aab, aba, etc. and assign each of them a 9-bit codeword, I replace each 24-bit sequence with a 9-bit sequence. So arithmetic coders have no problem with cases like this.
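The codeword arithmetic can be checked with a short Python sketch. The comment uses 9-bit codewords, which gives 24:9; since eight combinations can be told apart with only 3 bits, the sketch below assumes 3-bit codewords, which gives the 8:1 ratio from the parent comment. The `codebook`, `encode`, and `decode` names are hypothetical, and this is a fixed-length code, not an arithmetic coder.

```python
import math
from itertools import product

# The 8 combinations: aaa, aab, aba, abb, baa, bab, bba, bbb.
triples = ["".join(p) for p in product("ab", repeat=3)]

input_bits = 3 * 8                               # three 8-bit characters
min_bits = math.ceil(math.log2(len(triples)))    # shortest fixed-length code

# Assign each triple a fixed-length binary codeword (as a string of 0/1).
codebook = {t: format(i, f"0{min_bits}b") for i, t in enumerate(triples)}
inverse = {code: t for t, code in codebook.items()}


def encode(text: str) -> str:
    """Replace each 3-character group with its codeword."""
    if len(text) % 3:
        raise ValueError("length must be a multiple of 3 in this sketch")
    return "".join(codebook[text[i:i + 3]] for i in range(0, len(text), 3))


def decode(bits: str) -> str:
    """Inverse of encode."""
    return "".join(
        inverse[bits[i:i + min_bits]] for i in range(0, len(bits), min_bits)
    )


print(input_bits / 9)         # ratio with the 9-bit codewords from the comment
print(input_bits / min_bits)  # ratio with the shortest fixed-length codewords
print(encode("aabbab"))
```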


unsigner 2006 days ago

But LZ4 doesn't have an arithmetic coder, or any other statistical encoding; it's just matches and literals. Puzzling...
