by nemetroid 1234 days ago
They're still using zlib to do the heavy lifting. It's not a large patch.

https://public-inbox.org/git/1328fe72-1a27-b214-c226-d239099...

> So the internal implementation takes 17% longer on the Linux repo, but
> uses 2% less CPU time. That's because the external gzip can run in
> parallel on its own processor, while the internal one works sequentially
> and avoids the inter-process communication overhead.
> What are the benefits? Only an internal sequential implementation can
> offer this eco mode, and it allows avoiding the gzip(1) requirement.
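
To make the trade-off in the quote concrete, here is a minimal Python sketch of the two approaches, not git's actual C code. The function names `gzip_in_process` and `gzip_external` are hypothetical. The first compresses sequentially inside the calling process using zlib; the second pipes the data to a separate gzip(1) process, which can run on another core but adds inter-process communication and requires the binary to be installed.

```python
import gzip
import shutil
import subprocess
import zlib


def gzip_in_process(data: bytes, level: int = 6) -> bytes:
    """Produce gzip-format output with zlib, sequentially, in this process."""
    # wbits=31 (16 + 15) asks zlib to wrap the deflate stream in a gzip
    # header and trailer instead of the default zlib wrapper.
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    return compressor.compress(data) + compressor.flush()


def gzip_external(data: bytes, level: int = 6) -> bytes:
    """Pipe the data through a separate gzip(1) process."""
    if shutil.which("gzip") is None:
        raise RuntimeError("gzip binary not found on PATH")
    # -c writes to stdout; -n omits the original name and timestamp.
    result = subprocess.run(
        ["gzip", f"-{level}", "-c", "-n"],
        input=data,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    payload = b"hello git archive\n" * 1000
    internal = gzip_in_process(payload)
    # Both paths yield valid gzip streams that decompress to the same bytes.
    assert gzip.decompress(internal) == payload
    if shutil.which("gzip"):
        assert gzip.decompress(gzip_external(payload)) == payload
```

Both paths use the same DEFLATE algorithm, so the difference the quote describes is about where the work runs, not how well the data compresses.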

It seems like they changed it because it uses less CPU, which makes sense from a "we're a global git hosting company" perspective, but less so for users who run the command themselves. They intentionally made it 17% slower to save 2% of CPU time, which probably makes sense at their scale, but why should every user who runs the command locally lose 17% more time?

This was a change in the upstream git project; I don't think it necessarily came from GitHub.

Looks like the author is the maintainer of "Git for Windows", and similar, which I can imagine makes for a reasonable argument for reducing dependencies. zlib is already a library dependency, so just use that instead of requiring people to bundle a gzip binary along with git, too.

https://lore.kernel.org/git/pull.145.git.gitgitgadget@gmail....

Because they pay for the 2% CPU time, not for the 17% local time. In theory the user also pays for 2% less CPU time, but they are much less likely to be CPU-limited in their build processes.

Of course 17% more time may not really be that much for most processes. Are we talking about 17% more of a second or of an hour?

It seems like if they really wanted to save CPU they'd be caching the outputs. I fail to see why they would be recompressing years-old release tags. This seems like optimization at the wrong level.
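
A rough sketch of what caching the outputs could look like, assuming archives are keyed by commit ID and format. The helper `cached_archive` and its `build` callback are hypothetical and do not describe how any hosting service actually works.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_archive(
    cache_dir: Path, commit_id: str, fmt: str, build: Callable[[], bytes]
) -> bytes:
    """Return the archive for (commit_id, fmt), building it only on a cache miss."""
    # A commit ID names immutable content, so it is a stable cache key.
    key = hashlib.sha256(f"{commit_id}:{fmt}".encode()).hexdigest()
    path = cache_dir / key
    if path.exists():
        return path.read_bytes()
    data = build()  # the expensive step, e.g. tar plus gzip compression
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so readers never see a partial file.
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(data)
    tmp.replace(path)
    return data


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        builds = []

        def build() -> bytes:
            builds.append(1)
            return b"archive-bytes"

        cached_archive(Path(d), "abc123", "tar.gz", build)
        cached_archive(Path(d), "abc123", "tar.gz", build)
        assert len(builds) == 1  # second request was served from the cache
```

One caveat with this approach: cached bytes stay stable only as long as the cache entry survives, so a rebuilt archive may differ byte for byte if the compressor changes in between.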

That's without even mentioning the absurdity of saving 2% CPU but still using zlib.