by kelnos 1238 days ago
The thing I don't get is how this ever worked.

The change was upstream from git itself, and it was to use the builtin (zlib-based) compression code in git, rather than shelling out to gzip.

But would the gzip binary itself give reproducible results across versions of gzip (and zlib)? Intuition seems to suggest it wouldn't, at least not always. And if not, was the "strategy" just to never update gzip or zlib on GitHub's servers? That seems like a non-starter...


gzip is 28 years old. I don't think the output changes anymore.
There is no reason to believe that it won't change. Even after 28 years, improvements to the compressor could still be merged. Or perhaps especially after 28 years: we have a lot more memory now, but it is slower relative to our CPUs than it used to be, so there is most likely room for tuning. The same goes for patches that make use of newer CPU instructions: why would you expect them to take care to produce the exact same output, rather than just the best compression possible within a perf budget?
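To make that concrete, here is a rough Python sketch. It does not reproduce what git or GitHub actually do; it only uses two compression levels of Python's own `gzip` module as a stand-in for "two compressor builds with different tuning". The `payload` contents and the `sha256_hex` helper are made up for illustration.

```python
import gzip
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string (illustrative helper)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical archive contents; any compressible bytes would do.
payload = b"".join(b"file %d: some source text\n" % i for i in range(2000))

# Assumption: level 0 (stored) vs. level 9 stands in for two compressor
# versions that make different encoding choices. mtime=0 pins the
# timestamp in the gzip header so it cannot be the source of a difference.
archive_a = gzip.compress(payload, compresslevel=0, mtime=0)
archive_b = gzip.compress(payload, compresslevel=9, mtime=0)

# The compressed bytes differ, so a checksum of the archive differs too.
print("archive bytes equal:  ", archive_a == archive_b)
print("archive hashes equal: ",
      sha256_hex(archive_a) == sha256_hex(archive_b))

# Both archives still decompress to exactly the same content.
print("contents equal:       ",
      gzip.decompress(archive_a) == gzip.decompress(archive_b))
```

Both outputs are valid gzip streams of the same data, yet a checksum pinned to one of them would reject the other. Hashing the decompressed content is the only comparison here that does not depend on the compressor's choices.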
That's the whole point: it was never an enforced contract, it just happened not to change for a long time, so it was assumed to be part of the contract. The majority of users don't know exactly how GitHub serves these archives; they just assume (incorrectly, but reasonably) that if they download from this URL they'll always get the same archive, bit for bit. That assumption grew stronger the longer the archives stayed the same, until today.