by vmchale 2139 days ago
Surprised they didn't look more at zstd.

IME it's faster than brotli and often has a better compression ratio.

We investigated zstd heavily and met with its brilliant inventor, Yann, who provided amazing insights into the design and rationale behind zstd and why it is so fast and such an amazing technology. I also transpiled zstd to Rust using https://github.com/immunant/c2rust and tried various WebAssembly mechanisms to run it (I couldn't get WebAssembly quite fast enough, and teaching c2rust to produce safe code would be quite a slog).

But the main reason we settled on Brotli was the second-order context modeling, which makes a substantial difference in the final size of files stored on Dropbox (several percent on average as I recall, with some files getting much, much smaller). And for the storage of files, especially cold files, every percent of improvement translates into cost savings.

Also, widespread in-browser support for Brotli makes it possible for us to serve Dropbox files directly to browsers in the future (especially since they are concatenatable). Zstd browser support isn't at the same level today.

> the main reason we settled on Brotli was the second order context modeling

This advanced feature only becomes relevant at compression levels 10 or 11, which are extremely slow. Below that, the encoder barely uses it, because of its memory and CPU cost.

Given that your application ran into speed concerns and ends up using Brotli at compression level 1 in production, you may be surprised to find that in this speed range zstd compresses both faster and stronger, by quite a substantial margin.
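
A rough way to check a speed-versus-ratio claim like this on your own data is to time each codec at the level you actually run in production. The sketch below is only an illustration: it uses Python's stdlib zlib as a stand-in (Brotli and zstd bindings are third-party and not assumed to be installed), and the `measure` helper and the sample payload are hypothetical.

```python
import time
import zlib
from typing import Tuple


def measure(data: bytes, level: int) -> Tuple[int, float]:
    """Return (compressed size in bytes, seconds taken) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip check so a broken codec never "wins" the comparison.
    assert zlib.decompress(compressed) == data
    return len(compressed), elapsed


if __name__ == "__main__":
    # Hypothetical payload; substitute real blocks from your own workload.
    sample = b"".join(b"block %d of some file contents\n" % i for i in range(20000))
    for level in (1, 6, 9):
        size, seconds = measure(sample, level)
        print(f"level={level} size={size} time={seconds * 1000:.2f} ms")
```

To compare Brotli against zstd, swap the `zlib.compress` call for the codec under test and keep the same payload and the same timing harness for both.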

For long-term storage of blocks, we compress at much higher compression levels, as you mention. These densely compressed blocks are, in turn, served directly to customers when they download their own files.

For uploads you're right: we'd theoretically be better off with high-performing zstd, but there is a cost to maintaining two separate compression pipelines that are similar but different, one for uploads and one for downloads.

Plus, there is no safe Rust zstd compressor, and the safe Rust zstd decompressor linked in this thread has only recently become available and is also several times slower than the safe Rust Brotli decompressor.

From the blog:

> Pre-coding: Since most of the data residing in our persistent store, Magic Pocket, has already been Brotli compressed using Broccoli, we can avoid recompression on the download path of the block download protocol. These pre-coded Brotli files have a latency advantage, since they can be delivered directly to clients, and a size advantage, since Magic Pocket contains Brotli codings optimized with a higher compression quality level.
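
The pre-coding idea in that quote can be sketched roughly as follows. This is a minimal illustration, not Dropbox's implementation: `serve_block` and its parameters are hypothetical names, and the demo uses stdlib zlib (the HTTP `deflate` coding) as a stand-in for Brotli, which is not assumed to be installed.

```python
import zlib
from typing import Callable, Dict, Tuple


def serve_block(
    stored: bytes,
    stored_encoding: str,
    accept_encoding: str,
    decompress: Callable[[bytes], bytes],
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a block that is stored already compressed."""
    # Simplified Accept-Encoding parsing: q-values are dropped, so "q=0" is not honored.
    accepted = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if stored_encoding in accepted:
        # Pre-coded path: send the stored bytes untouched, with no recompression.
        return {"Content-Encoding": stored_encoding}, stored
    # Fallback: the client cannot decode the stored form, so decompress once.
    return {}, decompress(stored)


if __name__ == "__main__":
    raw = b"example block contents " * 200
    stored = zlib.compress(raw, 9)  # compressed once, at a high level, at storage time
    headers, body = serve_block(stored, "deflate", "gzip, deflate", zlib.decompress)
    print(headers, len(body), "bytes sent instead of", len(raw))
```

The point of the design is that the expensive high-level compression happens once at storage time, so the download path only has to copy bytes.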

It looks like they did, but having an implementation in a memory-safe language was one of their requirements. Learning that was, for me, the most fascinating part of the article.

A pure-Rust implementation of the zstd decoder already exists in production: https://github.com/KillingSpark/zstd-rs

Surely Dropbox would have the engineering power to re-implement zstd in a memory-safe language if it were sufficiently beneficial.

I'm sure they could implement it, technically speaking, but if a compression format is not widespread enough for others to have done so already, they can probably take that as a sign of how well supported it is.