by ajkjk 3793 days ago
No, it's not the same as a distributed system with master/slave nodes. The child nodes can function entirely in isolation from the parent. If you wanted to, you could treat another coworker's node as your master and download/upload to that. It's usually easier to have a tree structure where the root is your master repo, its children are your build servers or whatever, and the leaves are development machines. But that's entirely reconfigurable.

It's not surprising at all that if you make a master repo at the root of the tree, and it goes down, then you can't communicate with it. But that doesn't prohibit any communication between other nodes, or re-wiring the tree, and it definitely doesn't inherently block development work on any of the other nodes.

It just so happens, though, that people's build scripts and package managers like to refresh packages from the root and don't handle failure modes of that operation very well. That's the only place problems emerge, besides the obvious fact that if your public releases of software go through the root, and the root is down, then you can't release until it's up. But you could easily make a new root if you wanted to.


"It just so happens, though, that people's build scripts and package managers like to refresh packages from the root and don't handle failures modes of that operation very well. "

That's the critical part. So, countering this risk is apparently a manual thing if one uses off-the-shelf tooling for Git. I'll just have to remember to look at that if I do a deployment. Put it on a checklist or something.

>So, countering this risk is apparently a manual thing if one uses off-the-shelf tooling for Git.

Not so much off-the-shelf tooling for Git; it's more the off-the-shelf tooling for Node/Ruby/Go/Rust/PHP.

Nothing about Node's npm really requires it to depend on GitHub alone; in fact, I think you can use any Git repo. It's just that most projects tend to use a single Git repo, and there is no way to configure mirrors.

Thanks for the extra detail.

"and there is no way to configure mirrors."

Is that in Git itself or in the project-specific tooling you're mentioning?

There is no way to configure mirrors with the project-specific tooling (AFAIK).

Git (like most other DVCSs) supports mirroring. For example, Linux, hosted on GitHub (https://github.com/torvalds/linux/commits/master), is also mirrored and hosted on kernel.org (https://git.kernel.org/cgit/). The same goes for the Apache projects (https://github.com/apache/cassandra), which are also hosted on apache.org (https://git-wip-us.apache.org/repos/asf?p=cassandra.git). Generally, when commits are merged upstream, they are mirrored to all the other hosts.

The tools, however, are generally configured with only the GitHub address (or the author of the tool only publishes to GitHub), and the tools (unlike, say, Perl's CPAN) don't offer to maintain mirrors of the libraries published. So when GitHub is down, a tool like npm will give up, even though the author could have another Git repo hosted elsewhere.
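
To make the "give up" part concrete, here is a rough sketch in Python of what mirror fallback could look like in such a tool. None of this is how npm or any real package manager works; `run_git` and `clone_with_fallback` are names made up for illustration, and the URLs in the usage comment are hypothetical placeholders.

```python
import subprocess
from typing import Callable, Optional, Sequence


def run_git(args: Sequence[str]) -> bool:
    """Run a git command and report whether it exited with status 0."""
    try:
        return subprocess.run(["git", *args], check=False).returncode == 0
    except OSError:
        # git is not installed or could not be started
        return False


def clone_with_fallback(
    urls: Sequence[str],
    dest: str,
    run: Callable[[Sequence[str]], bool] = run_git,
) -> Optional[str]:
    """Try `git clone` against each URL in order.

    Returns the first URL that cloned successfully, or None if every
    host failed. `run` is injectable so the logic can be exercised
    without touching the network.
    """
    for url in urls:
        if run(["clone", "--quiet", url, dest]):
            return url
    return None


# Hypothetical usage: the primary host first, then a mirror.
# used = clone_with_fallback(
#     ["https://primary.example.invalid/project.git",
#      "https://mirror.example.invalid/project.git"],
#     "project",
# )
```

The point is only that the fallback is a few lines of client-side logic once the tool knows about more than one address; the hard part is getting authors to publish and list the mirrors.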

> For example Linux, hosted on Github, (https://github.com/torvalds/linux/commits/master) is also mirrored and hosted on kernel.org (https://git.kernel.org/cgit/).

It's the opposite: Linux is hosted on kernel.org, and the mirror on github.com is just something that was created during a kernel.org outage. The canonical address is the kernel.org one.

(The Linux repository on kernel.org, by the way, is one of the oldest git repositories; IIRC, it was created when git was only a few weeks old.)

So, the protocol is definitely good enough to handle situations like this, but it just isn't commonly applied that way, esp. with many GitHub-hosted projects. Gotcha. That makes sense.

Git is very flexible and does not even require repo-to-repo communication over the wire at all; patches can be emailed among contributors and then committed and tracked locally. Branching and merging are so fast and easy in Git that every participant can have a slightly different repo for a given project, incorporating shared changes as they see fit.

Github is popular because it is opinionated--it chooses to use git in certain ways, thus reducing the complexity for people who aren't git experts (i.e. most people).

The most sophisticated users of Git (probably the Linux and Git projects) do not rely on GitHub at all. As far as I know, they share code via emailed patches. Some of those developers might not even be using Git at all! They just send patches upstream, and the upstream developer checks the patches into their local Git repo and then preps a larger patch to be emailed further upstream.

This is a social problem, not a technical one.