> So the internal implementation takes 17% longer on the Linux repo, but
> uses 2% less CPU time. That's because the external gzip can run in
> parallel on its own processor, while the internal one works sequentially
> and avoids the inter-process communication overhead.
> What are the benefits? Only an internal sequential implementation can
> offer this eco mode, and it allows avoiding the gzip(1) requirement.
It seems like they changed it because it uses less CPU, which makes sense from a "we're a global git hosting company" perspective, but less so for users who run the command themselves. They intentionally made it 17% slower to save 2% of CPU time, which probably makes sense at their scale, but why should every user who runs the command locally lose 17% more time?
This was a change in the upstream git project; I don't think it necessarily came from GitHub?
Looks like the author is the maintainer of "Git for Windows" and similar projects, which I can imagine makes for a reasonable argument for reducing dependencies. zlib is already a library dependency, so just use that instead of requiring people to bundle a gzip binary along with git, too.
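To make the dependency point concrete, here is a rough Python analogy (not git's actual C code) of the two approaches: compressing in-process through the zlib-backed `gzip` module versus piping the data to an external `gzip` binary. The function names and the payload are made up for illustration, and the external path only works if a `gzip` executable is on `PATH`, which is exactly the requirement the internal version avoids.

```python
import gzip
import shutil
import subprocess


def compress_internal(data: bytes) -> bytes:
    # In-process: the zlib-backed gzip module does the work in the caller's
    # own process, so no external binary is needed.
    return gzip.compress(data, compresslevel=6, mtime=0)


def compress_external(data: bytes) -> bytes:
    # Out-of-process: spawn gzip(1) and pipe the data through it. The child
    # can run on another core, at the cost of pipe traffic between processes.
    result = subprocess.run(
        ["gzip", "-c", "-n"],
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


payload = b"example archive contents\n" * 1000  # stand-in for a tar stream

internal = compress_internal(payload)
print("internal:", len(internal), "bytes")

if shutil.which("gzip"):  # the external path depends on gzip being installed
    external = compress_external(payload)
    print("external:", len(external), "bytes")
else:
    print("external: gzip binary not found")
```

Both paths produce a valid gzip stream; the difference is where the work runs and what has to be installed.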
Because they pay for the 2% CPU time, not for the 17% local time. In theory the user also benefits from using 2% less CPU time, but they are much less likely to be CPU-limited in their build processes.
Of course 17% more time may not really be that much for most processes. Are we talking about 17% more of a second or of an hour?
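For a sense of scale, a quick back-of-the-envelope calculation, assuming the 17% figure from the quoted commit message applies uniformly (it was measured on the Linux repo, so other repos may differ). The baselines of one second and one hour are just the two cases from the question above.

```python
def extra_seconds(baseline_seconds: float, slowdown: float = 0.17) -> float:
    # Additional wall-clock time caused by a relative slowdown.
    return baseline_seconds * slowdown


for label, baseline in [("1 second", 1.0), ("1 hour", 3600.0)]:
    print(f"{label}: +{extra_seconds(baseline):.2f} s")
```

That is roughly 0.17 s on a one-second archive versus about 612 s (a little over ten minutes) on an hour-long one.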
It seems like if they really wanted to save CPU they'd be caching the outputs. I fail to see why they would be recompressing years-old release tags. This seems like optimization at the wrong level.
That's without even mentioning the absurdity of saving 2% CPU but still using zlib.
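The caching idea could look something like the sketch below: build the compressed archive once per commit and serve the stored bytes afterwards. Everything here is hypothetical (`ArchiveCache`, `fake_build`, the commit id `abc123`); it says nothing about how any hosting service actually works.

```python
import gzip
from typing import Callable, Dict


class ArchiveCache:
    """Memoize compressed archives by commit id (hypothetical sketch)."""

    def __init__(self, build: Callable[[str], bytes]) -> None:
        self._build = build
        self._store: Dict[str, bytes] = {}
        self.builds = 0  # how many times the expensive build actually ran

    def get(self, commit_id: str) -> bytes:
        # Compress only on the first request for a given commit.
        if commit_id not in self._store:
            self._store[commit_id] = self._build(commit_id)
            self.builds += 1
        return self._store[commit_id]


def fake_build(commit_id: str) -> bytes:
    # Stand-in for "tar the tree at this commit and gzip it".
    return gzip.compress(f"tree for {commit_id}\n".encode(), mtime=0)


cache = ArchiveCache(fake_build)
first = cache.get("abc123")
second = cache.get("abc123")  # served from the cache, no recompression
print(cache.builds)
```

The trade-off is storage for CPU: a release tag never changes, so its archive only needs to be compressed once.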