Hacker News
by bhaak 3800 days ago
"Millions of people and businesses depend on GitHub"

Well, we shouldn't depend on it so much.

I shudder at the thought of what an outage of GitHub would mean for our company. This time we were lucky, as it happened during the night in Europe.

Unfortunately, I don't have the power to test this scenario in our company.

4 comments

Like others, I am confused by this common sentiment. GitHub is the remote repo, but the version control is distributed, so everyone has a copy. I'm pretty sure I can fill a few hours or more with work that needs doing on my local repo. FYI, I'm not a professional software developer, but I would like to know.

The things that come to mind: issue trackers, messaging, not being able to see latest pull requests.

Update: Now I'm starting to understand the build dependency issue. Still, why do you need to rebuild all dependencies from their GitHub repos to build the application? Can't the currently available version work?

Continuous integration, continuous delivery. Your Jenkins jobs all point to repos on GitHub? Do you plan to fix every single URL? Some tools actually pull stuff from GitHub. If you don't have a private mirror somewhere, where do you push your code? How can you tell you actually have the latest of everything? Time to compare with every co-worker.
It shouldn't really have much effect. One of git's major selling points is that it's a DVCS, meaning that everyone has a local copy of the repository. Perhaps some collaboration features will be down for a couple of hours (which I think is a downside to GitHub's decision not to put issues/PR history inside of git), but everyone should still be able to do work, commit to the repo, review history, and so forth. If you have people who write code, they can probably find something to work on for two hours without having the Issues/PR interface, right?
All sorts of other dependencies go down though. Packages you need for your build aren't there. CI or testing integrations don't happen. Code review is probably not happening. If you track issues in GH you can't see what's next to work on or look up requirements.

You're right in that you're (probably) not totally deadlocked. But I can't start to estimate the lost $$ in productivity that comes with a global GH outage because of all that.

Have a local repo that periodically mirrors the master one on GitHub.

Should that fail, work against the local repo until GitHub is back, then sync back to it.
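
For anyone who wants to try the mirroring half of this, here is a minimal sketch in Python. It assumes git is on the PATH and that the mirror is a bare repo created with `git clone --mirror`; the function names `mirror_command` and `sync_mirror` are made up for illustration, and the upstream URL and mirror path are whatever you pass in.

```python
import subprocess
from pathlib import Path


def mirror_command(upstream_url: str, mirror_dir: Path) -> list[str]:
    """Return the git command that creates or refreshes a bare mirror."""
    # A bare repository keeps its HEAD file at the top level of the directory.
    if (mirror_dir / "HEAD").exists():
        # Mirror already exists: fetch all refs and drop refs deleted upstream.
        return ["git", "-C", str(mirror_dir), "remote", "update", "--prune"]
    # First run: a bare clone that copies every ref from upstream.
    return ["git", "clone", "--mirror", upstream_url, str(mirror_dir)]


def sync_mirror(upstream_url: str, mirror_dir: Path) -> bool:
    """Run the sync. Returns False if git exited with an error
    (for example because upstream was unreachable); the mirror is left as it was."""
    result = subprocess.run(mirror_command(upstream_url, mirror_dir))
    return result.returncode == 0
```

Run `sync_mirror` from cron or a scheduler of your choice. Note that this only covers the GitHub-to-mirror direction: a mirror fetch overwrites the mirror's refs with upstream's, so anything pushed to the mirror during an outage has to be pushed to GitHub before the next refresh runs.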

Depending on your definition of "periodically", you may lose almost as much time syncing back as the outage would have cost you without the local mirrors.
I've written scripts that do this. Any request for a repo goes to the local repo server, which makes sure it has the repo and then quickly checks whether its copy is out of date, caching the resulting files if the upstream repo can be reached. If the upstream repo can't be reached, the proxy just delivers the old fileset. So the local repo gets updated, or at least attempts to update, with every hit against it. I had some other logic in the script to only check freshness every 10-15 minutes, so that during times when a lot of machines were pulling, they were essentially guaranteed to all get the same version.
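
Not the actual scripts, but the logic described above can be sketched roughly like this in Python. The class name `ThrottledMirror`, the `fetch` callable (standing in for whatever pulls the fileset from upstream), and the 600-second default are all assumptions for illustration. It also assumes an unreachable upstream shows up as an `OSError`, and that failed attempts count toward the throttle.

```python
import time
from typing import Callable, Optional


class ThrottledMirror:
    """Serve a cached copy, checking upstream at most once per interval."""

    def __init__(self, fetch: Callable[[], bytes], min_interval: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch                # hypothetical: pulls the fileset from upstream
        self._min_interval = min_interval  # seconds between freshness checks
        self._clock = clock
        self._cached: Optional[bytes] = None
        self._last_check: Optional[float] = None

    def get(self) -> bytes:
        now = self._clock()
        due = self._last_check is None or now - self._last_check >= self._min_interval
        if due:
            # Record the attempt even if it fails, so a dead upstream
            # is not retried on every single request.
            self._last_check = now
            try:
                self._cached = self._fetch()
            except OSError:
                pass  # upstream unreachable: keep serving the old copy
        if self._cached is None:
            raise RuntimeError("upstream unreachable and nothing cached yet")
        return self._cached
```

Within one interval every caller gets the same cached bytes, which is what keeps a burst of machines on the same version. Whether a failed check should reset the timer is a judgment call; retrying sooner after a failure would be just as reasonable.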
This is certainly one of the better ways to do it - when I see a word like periodically I assume it means daily/weekly/on some sort of calendar-based schedule, which isn't necessarily the case here.
Why is the master on GitHub anyway?

If a company can't maintain their own internal tools and self-hosting servers, why does the same company think it can run reliable services for users?

Not putting the core of your business on a remote platform is disaster mitigation 101.

Why use AWS, GCE or any other virtualization provider? I suspect for some subset of companies the answer is the same.

Relying on GitHub is not the problem; relying on GitHub to be available 24/7 is. GitHub provides a free master node for your eventually consistent database needs, where the database is git. The "eventual" part is key here.

The key word is infrastructure.

GitHub should not be the master; it should be a mirror of a company master hosted on the company's own server.

The main problem with that is that some companies do not want the cost of the infrastructure plus the cost of a sysadmin to set it up, etc.

The second problem is the build: even if you host your own repos, if all your dependencies are on GitHub and you don't include them in the repo, then you are bust.