Github is great, they're working on some serious problems, and every service has downtime.
But this is getting pretty serious.
This is the 3rd time in the last few weeks there has been a "significant service outage".
So much of a typical dev workflow is based around github, but more importantly a lot of new package managers use github as a base. Not being able to pull dependencies is a fairly big problem.
I will continue to use github, because they're awesome. At the same time, we're going to have to start building around it to ensure our uptime isn't reliant on their (less than optimal) uptime.
I've found it somewhat odd how, although advocating the use of a decentralized version control system, much of the git community has ended up heavily centralized on GitHub.
Yup, when github barfs, you can still happily keep on hacking.
You're also not limited to one shared repo by any means, so even synchronization between multiple developers can continue when github is down. To avoid problems it's of course a good idea to designate one of the repos as the "master", and if you use a secondary repo for synchronization while the master is down, you'll probably want to switch to unique temporary branches for the purpose.
Another possible backup is any host on the internet. Seriously, dump it into a static file directory on any old web server or run a trivial git server. Send deltas with patch files. It's not even harder than github is, just uglier.
If you want to use your own servers, Atlassian (who own Bitbucket) recently released Stash, a competitor to Github Enterprise. Pricing is very good and if you have a commercial license, you also get the source code.
Perhaps there may be a way to somehow sync up github with bitbucket so they act as a single master origin. Then we could use either one and they'd be synchronized automatically. Or some company that just acts as a mirror to github would be great.
So set up one small git server that mirrors your most critical repositories, which devs can fail over to in the event GitHub is down _exactly_ when you need to integrate changes.
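A rough sketch of what that mirror could look like, using only local scratch directories so it runs without network access. The `upstream` repo stands in for the GitHub-hosted one (in practice it would be a URL), and every path and commit message here is made up for illustration.

```shell
work=$(mktemp -d)   # scratch directory; all paths below are hypothetical

# Stand-in for the GitHub-hosted repo (normally a remote URL).
git init -q "$work/upstream"
git -C "$work/upstream" -c user.name=Dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "first commit"

# One-time setup on the small server: a bare copy of every branch and tag.
git clone -q --mirror "$work/upstream" "$work/mirror.git"

# New work lands upstream...
git -C "$work/upstream" -c user.name=Dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "second commit"

# ...and a periodic job (cron, say) refreshes the mirror.
git -C "$work/mirror.git" remote update --prune >/dev/null 2>&1
```

When GitHub is down, devs can fetch from the mirror path or URL instead. Pushing back to it is possible too, but then someone has to reconcile it with GitHub afterwards.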
I wouldn't call this "building around" GitHub. Git's nature promotes this kind of thing.
This is what I think may be their biggest problem. Because of the outages, and due to my own stupidity, I have been caught with my pants down twice because Github was always just up. And then I thought, well, lesson learned, they won't go down again, and ker-pow (I'm stupid like that).
So I have had to create a secondary remote, and for private projects, it's really easy to do so. Which is kinda problematic, because those are the accounts that keep the lights on for Github. If I have to cut budget, it's a lot more likely now that I would consider cutting the private repos.
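For anyone wondering how easy "really easy" is, here is a minimal sketch. The `backup.git` directory stands in for whatever second host you pick (normally an SSH or HTTPS URL); the remote name `backup` and the paths are assumptions for the example.

```shell
work=$(mktemp -d)   # scratch directory; all paths below are hypothetical

# Stand-in for the second host (normally a URL on another service or server).
git init -q --bare "$work/backup.git"

# Stand-in for your existing working clone.
git init -q "$work/project"
git -C "$work/project" -c user.name=Dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "work in progress"

# Register the second host as an extra remote and push the current branch to it.
git -C "$work/project" remote add backup "$work/backup.git"
git -C "$work/project" push -q backup HEAD
```

After that, `git push backup` and `git fetch backup` work whether or not the primary host is reachable.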
I just hope they figure this out soon, so I can forget about how easy it was.
The beauty of decentralized version control is a node going down doesn't kill your workflow. In a pinch, you could use git bundle to exchange commits with other developers:
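Something along these lines, as a sketch rather than a recipe. The `alice` and `bob` directories and the bundle filename are placeholders; in real use the bundle file would travel by email, scp, a USB stick, or whatever is handy.

```shell
work=$(mktemp -d)   # scratch directory; all paths below are hypothetical

# Sender's repository (stand-in for an existing clone).
git init -q "$work/alice"
git -C "$work/alice" -c user.name=Alice -c user.email=alice@example.com \
    commit -q --allow-empty -m "initial commit"

# Pack every ref and its history into a single file.
git -C "$work/alice" bundle create "$work/repo.bundle" --all 2>/dev/null

# Receiver: sanity-check the file, then clone from it as if it were a remote.
git -C "$work/alice" bundle verify "$work/repo.bundle" >/dev/null 2>&1
git clone -q "$work/repo.bundle" "$work/bob"
```

A developer who already has a clone can run `git fetch` or `git pull` against the bundle file instead of cloning it.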
Yeah, don't do that. As a scala/maven guy it kills me to see all these package managers making the same mistakes we already solved, over and over again.
Because it's tying you to an unreliable third-party service, and there's no way to mitigate it. Artifact dependencies really shouldn't be in the same place as source, they fulfil different roles.
What you want is a dedicated repository format. Libraries can still be hosted by whoever maintains them on their own repository (which can be their own piece of software as long as it follows the standards), or in the community central repository. But either way, if you depend on those libraries and want to lower your risks, it's trivial to set up a local mirror and make sure all your third party dependencies come in via this mirror. That way if their repository goes down temporarily or permanently it's no problem, and you ensure your builds remain reproducible.
The most infuriating part is, the software to do this already exists. If you want to start a new language, great. But please, use maven; otherwise you are doomed to re-invent it, poorly.
I'm not 100% positive on the specifics, but I've never seen it used. I suspect it's not possible.
(context: I make https://circleci.com - a continuous integration company for web apps, often Rails. We occasionally get support requests that allow/ask us to look at Gemfiles, so I've seen an above average number of Gemfiles. However, I more often see the stdout of the `bundle install` command, which shows GitHub being accessed).
You can also specify multiple gem sources (http://gembundler.com/v1.2/gemfile.html), but usually only rubygems.org is used unless you need a private gem repository.
Yeah the question is, can you specify to grab rails from github with a fallback if it is not up? So the same system it seems to be using with multiple gem sources but on a gem by gem basis like you would do with a github repo.
Many other companies are able to implement web services without the reliability issues GitHub has experienced. As your parent poster explained, some unreliability is always tolerable, but GitHub is starting to cross that threshold. Your list of fallacies does nothing to address that very real complaint.
I get that outages are a way of life with a widely used web application, but Github have really dropped the ball lately. This is one of many recent service outages, and as a paying customer I find it disheartening and worrying, because I and many others have come to rely on Github in our day-to-day workflow. Don't get me wrong here, I love Github and couldn't live without it, but they really need to sort out these problems, and it's not like they lack the funds to address them. My knowledge of distributed computing is somewhat limited, but I would have thought they'd just spin up a few extra virtual machines to handle the database spike (maybe it's not that simple with Github's setup, I'm not sure).
Github have their own datacenter and hardware; they are not relying on any cloud provider out there. That makes it harder to handle load spikes, but they actually have an I/O-intensive service, which justifies that choice.
My favourite line from that linked blog post from 2009 is this: "We're aware of the current stability and performance issues, and we want to let you know what we're doing about it." - issues they were having nearly 4 years ago are still happening unless the problems they've faced lately are completely different.
There was a period where they were really having visible scaling problems, with response times often getting painfully slow, apparently mostly due to slow I/O. These problems completely disappeared after their move to Rackspace's managed hosting.
Now it looks like they might be hitting a new barrier that might require architectural changes to overcome.
I understand when my non-technical clients ask questions like, "I thought we fixed X?" or "shouldn't Y have been fixed by now?", but I would have hoped that fellow developers would cut each other some more slack, you know?
There are an infinite number of reasons an app could fail at any given moment. Just because we see the same "something went wrong" page 4 years later doesn't mean we're seeing the same problem, or even the same class of problem. It just means that the sun still rose in the east this morning, my stupid car will probably have something new wrong with it when I leave my apartment in a few hours, and software is still really, really complex.
The GitHub team is a smart bunch. I highly doubt they're still dealing with the same class of issues that plagued them four years ago, and it just seems kinda strange to even bring it up.