by dspillett 3369 days ago
> If you gzip data over the line it's already compressed. So minifying your stuff will only help you a little.

For small files you might be mostly correct, but for larger ones min+compress can produce much better gains than compression alone.

IIRC the algorithm used (DEFLATE) employs a sliding compression window, and can only match repeated strings whose distance apart is smaller than that window. The window tops out at 32KBytes, which is what gzip uses, and that isn't going to cover many large files. Minifying increases the effective range of the compression window: each match is shorter, but you will find more matches, and usually this balances out in a way that benefits the compression result.
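
The window limit can be seen with a small synthetic experiment. This is only a sketch using Python's zlib module on made-up random data, not a measurement on real JavaScript; the helper names `random_bytes` and `bytes_saved` are mine. The same 4 KiB block is repeated once within 32 KiB of its first copy and once far beyond it:

```python
import random
import zlib

def random_bytes(rng: random.Random, n: int) -> bytes:
    """Return n pseudo-random (effectively incompressible) bytes."""
    return rng.getrandbits(8 * n).to_bytes(n, "little")

def bytes_saved(data: bytes) -> int:
    """How many bytes zlib's DEFLATE (level 9) removes from data."""
    return len(data) - len(zlib.compress(data, 9))

rng = random.Random(0)  # fixed seed so runs are repeatable
block = random_bytes(rng, 4096)

# Second copy starts 12 KiB after the first: inside the 32 KiB window.
near = block + random_bytes(rng, 8 * 1024) + block
# Second copy starts 68 KiB after the first: outside the window.
far = block + random_bytes(rng, 64 * 1024) + block

print("bytes saved, repeat inside window: ", bytes_saved(near))
print("bytes saved, repeat outside window:", bytes_saved(far))
```

In the first case the compressor should recover most of the repeated 4 KiB; in the second it cannot see the earlier copy at all, so nothing is saved.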

It isn't quite that simple in reality as there is Huffman encoding and other tricks in the mix. This means that even for inputs smaller than the compression window you may see some benefit, as minifying can reduce the input data's alphabet significantly.
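
To illustrate the alphabet point in isolation, here is a rough sketch that turns off string matching (zlib's `Z_HUFFMAN_ONLY` strategy) and compresses two synthetic inputs of the same length, one drawn from 64 distinct byte values and one from 16. The data is random and hypothetical; real minified code shrinks its alphabet far less dramatically, so treat this as a demonstration of the mechanism only:

```python
import random
import zlib

def huffman_only_size(data: bytes) -> int:
    """Compressed size with string matching disabled (Huffman coding only)."""
    comp = zlib.compressobj(9, zlib.DEFLATED, 15, 8, zlib.Z_HUFFMAN_ONLY)
    return len(comp.compress(data) + comp.flush())

rng = random.Random(0)  # fixed seed so runs are repeatable
n = 8000                # well under the 32 KiB window

# Same length, different alphabet sizes (printable ASCII starting at 32).
wide = bytes(32 + rng.randrange(64) for _ in range(n))    # 64 distinct values
narrow = bytes(32 + rng.randrange(16) for _ in range(n))  # 16 distinct values

print("64-symbol input:", huffman_only_size(wide), "bytes")
print("16-symbol input:", huffman_only_size(narrow), "bytes")
```

With fewer distinct symbols each one gets a shorter code, so the 16-symbol input should come out noticeably smaller even though neither input has any repeats for the window to exploit.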

Ignoring the "why it helps", it is easy to show that it does help in a great many real cases:

  ds@s2:/tmp$ wget --quiet https://code.jquery.com/jquery-3.2.0.min.js
  ds@s2:/tmp$ wget --quiet https://code.jquery.com/jquery-3.2.0.js
  ds@s2:/tmp$ gzip jquery-3.2.0.min.js
  ds@s2:/tmp$ gzip jquery-3.2.0.js
  ds@s2:/tmp$ ls -l j*
  -rw-r--r-- 1 ds ds 79201 Mar 16 21:30 jquery-3.2.0.js.gz
  -rw-r--r-- 1 ds ds 30023 Mar 16 21:30 jquery-3.2.0.min.js.gz
In this example the result of min+comp is less than 40% of the size of the result from compression alone.

For completeness, minifying alone achieves less than compression alone:

  -rw-r--r-- 1 ds ds 267686 Mar 16 21:30 jquery-3.2.0.js
  -rw-r--r-- 1 ds ds  79201 Mar 16 21:30 jquery-3.2.0.js.gz
  -rw-r--r-- 1 ds ds  86596 Mar 16 21:30 jquery-3.2.0.min.js
  -rw-r--r-- 1 ds ds  30023 Mar 16 21:30 jquery-3.2.0.min.js.gz
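
For anyone who wants the ratios rather than raw byte counts, the arithmetic on the listing above can be sketched as follows (the sizes are copied from the `ls -l` output; the dictionary keys and the `percent` helper are just labels I picked):

```python
# File sizes in bytes, copied from the ls -l listing above.
sizes = {
    "plain": 267686,     # jquery-3.2.0.js
    "gzip": 79201,       # jquery-3.2.0.js.gz
    "min": 86596,        # jquery-3.2.0.min.js
    "min+gzip": 30023,   # jquery-3.2.0.min.js.gz
}

def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Each variant relative to the unminified, uncompressed file.
for name, size in sizes.items():
    print(f"{name:9s} {size:7d}  {percent(size, sizes['plain']):5.1f}% of plain")

# The comparison made in the text: min+gzip relative to gzip alone.
print(f"min+gzip vs gzip: {percent(sizes['min+gzip'], sizes['gzip']):.1f}%")
```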
One further factor is CPU time consumed on the client decompressing and parsing the content, but this is likely to be insignificant compared to the network or local IO time. If a device's CPU is under-powered enough that this is significant, then it is unlikely to be able to run the decompressed code with useful performance.

Most of the gains in there are from stripping out comments. That plus whitespace removal gets you most of the benefit. I don't think the parent was advocating dropping minification completely, just arguing against investing massive effort when you're already at the crest of the curve.
Stripping out comments is one step, but going further, removing all unused code and optimizing the rest with tools like Google Closure Compiler, is far more effective on most websites that ship a single bundle for everything.
When the subject is CSS, dead code removal is a way more complex problem, and only possible if your usage falls within certain constraints. Best bet is a component system with scoped styles that ensures you are only loading what is required.
It's not like this is a zero-sum game.

Attempts at improvement don't hurt at all, and in some cases can help a ton.

But they can hurt sometimes -- not all optimizations are always safe.
Of course, but there is still value in "unsafe optimizations" for those who won't be impacted by them.
Are you sure that's not just comments being removed?
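
One way to check would be to gzip three variants: the original, the original with only comments and blank lines stripped, and the fully minified file. Below is a rough sketch of that comparison. The stripping is deliberately naive (a regex for `/* ... */` blocks and whole-line `//` comments, so it can mangle string or regex literals that contain `/*`), and the function names are my own, so take the numbers it gives as an approximation only:

```python
import gzip
import re

def strip_comments(src: str) -> str:
    """Naively drop /* ... */ blocks, whole-line // comments and blank lines."""
    src = re.sub(r"/\*.*?\*/", "", src, flags=re.S)
    src = re.sub(r"^[ \t]*//.*$", "", src, flags=re.M)
    return "\n".join(line.rstrip() for line in src.splitlines() if line.strip())

def gzip_size(text: str) -> int:
    """Size in bytes of text after gzip at the highest compression level."""
    return len(gzip.compress(text.encode("utf-8"), compresslevel=9))

def compare(original_path: str, minified_path: str) -> dict:
    """Gzipped sizes of the original, comment-stripped and minified sources."""
    with open(original_path, encoding="utf-8") as f:
        original = f.read()
    with open(minified_path, encoding="utf-8") as f:
        minified = f.read()
    return {
        "original": gzip_size(original),
        "comments stripped": gzip_size(strip_comments(original)),
        "minified": gzip_size(minified),
    }
```

With the two files from the wget commands earlier in the thread in the current directory, `compare("jquery-3.2.0.js", "jquery-3.2.0.min.js")` would show how much of the gap comment removal alone accounts for.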