by asimpletune 875 days ago
I'm not sure I understand this at all.

> The problem is the way Git works, it clones your entire repository into the container with your cloud environment, using a slow network protocol.

What about git's network protocol is 'slow'?

I think I can also come up with a pretty simple experiment to prove or disprove this: 1. Fill a file with 13 GB of data and commit it. 2. Upload that to GitHub or wherever you want. 3. Time how long it takes to clone, and compare that to cloning the real GitHub.com repo.

You will find the one we made takes 'seconds' (or minutes, depending on your network connection), while the GitHub.com repo will take some time.

So, same data, two different results? The difference in this experiment rules out the 'slow' network protocol as the difference maker. The real reason is that the GitHub.com repo will have hundreds or thousands of commits.

Basically, the difference is the commit history, because that's how git needs to work. A clone fetches the objects for the entire commit history, not just the literal files at HEAD. I don't know what the network protocol has to do with that.

2 comments

It is perhaps worth pointing out that if you don't need the history you can just `git clone --depth 1` and save the network transfer and disk space.
It reminds me of when someone told me git submodules are slow.

They just forgot about shallow clones.
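
For anyone who wants to check the history point without a 13 GB upload, here is a rough local sketch. It assumes `git` is on your PATH; the helper names, the `data.txt` file, and the demo identity are made up for illustration. It builds a throwaway repo with a handful of commits, clones it once normally and once with `--depth 1` over a `file://` URL (a plain local path would ignore `--depth`), and counts how many commits each clone received.

```python
import subprocess
import tempfile
from pathlib import Path


def git(*args, cwd):
    """Run one git command in cwd and return its stdout (hypothetical helper)."""
    result = subprocess.run(
        [
            "git",
            "-c", "user.name=demo",               # throwaway identity, an assumption
            "-c", "user.email=demo@example.com",
            "-c", "commit.gpgsign=false",         # avoid depending on local signing setup
            *args,
        ],
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def count_commits_after_clone(num_commits=20):
    """Return (commits in a full clone, commits in a --depth 1 clone)."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        origin = tmp / "origin"
        origin.mkdir()
        git("init", "--quiet", cwd=origin)

        # Rewrite the same file repeatedly so the repo accumulates history.
        for i in range(num_commits):
            (origin / "data.txt").write_text(f"revision {i}\n")
            git("add", "data.txt", cwd=origin)
            git("commit", "--quiet", "-m", f"commit {i}", cwd=origin)

        # Use a file:// URL so git goes through its normal transport
        # and honours --depth.
        url = origin.as_uri()
        git("clone", "--quiet", url, "full", cwd=tmp)
        git("clone", "--quiet", "--depth", "1", url, "shallow", cwd=tmp)

        full = int(git("rev-list", "--count", "HEAD", cwd=tmp / "full"))
        shallow = int(git("rev-list", "--count", "HEAD", cwd=tmp / "shallow"))
        return full, shallow


if __name__ == "__main__":
    full, shallow = count_commits_after_clone()
    print(f"full clone: {full} commits, shallow clone: {shallow} commit(s)")
```

Both clones end up with the same working tree; only the amount of history transferred differs, which is the variable the experiment above is trying to isolate.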

If you use the dumb http protocol, both cases should be equally fast.
git clone https://github.com/github/docs.git 123.57s user 37.02s system 74% cpu 3:35.73 total

git clone --depth 1 https://github.com/github/docs.git 3.37s user 1.83s system 35% cpu 14.521 total

Not a scientific test at all, but the second one was about 15x faster in wall-clock time.
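
To double-check that figure, here is a small sketch that converts the two "total" values from the timings above into seconds and divides them. The function name is made up, and it assumes the totals are in the `[minutes:]seconds` form shown in the output.

```python
def wall_seconds(total: str) -> float:
    """Convert a wall-clock total such as '3:35.73' or '14.521' to seconds."""
    seconds = 0.0
    # Each ':' separates a larger unit (minutes, hours) from the next smaller one.
    for part in total.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


full_clone = wall_seconds("3:35.73")    # git clone (full history)
shallow_clone = wall_seconds("14.521")  # git clone --depth 1
speedup = full_clone / shallow_clone

print(f"full: {full_clone:.2f}s, shallow: {shallow_clone:.2f}s, speedup: {speedup:.1f}x")
```

That works out to roughly 215.7 s versus 14.5 s, so a little under 15x.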