I'm amazed at how much this has knocked me on my arse. I first attempted to redo the README for a service I've just open-sourced, before realising GitHub was down. Then I attempted to fix the company CI server (OOM errors because CarrierWave needs more than 500MB of memory to run a single spec, for some unknown reason), which failed because it couldn't check out the code. After giving up on that, I attempted to install Graphite on a company server, where I hit another roadblock because the downloads are hosted on GitHub, so I had to use Launchpad, which I had an allergic reaction to. Also, when I was shelling into the server, oh-my-zsh failed to update because, you guessed it, GitHub was down. Still, shouts to the ops team in the trenches, we're rooting for you.
I know that in theory a cloud solution should have higher uptime than an amateurishly set-up private server, but cloud solutions have a certain complexity and interdependence that make them very vulnerable to these kinds of 'full failures' where nothing is in your control.
Maybe you should take this time to learn from the outage and analyze what you could do to reduce the impact of a failure like this. For example, you could research what it would take for your company to move to another Git provider, perhaps even on your own server or a VM slice at some cloud provider.
I'm not saying you should drop GitHub, because obviously they have great service, but be realistic about cloud services.
Cloud service is like RAID: it is not a backup.
RAID is nice for recovering from errors without downtime, but there is still a chance that something bigger happens and you lose your data. In the same way, the cloud is nice for scalability and availability, but there's still a chance that everything goes down and you can't run your operations.