by thanksforfish
2262 days ago
This is a good point, but it could be worded better. One of the things a site reliability engineer should think about is how well the site can be operated when dependencies have issues. After an incident like this, even if you were able to recover, it's worth thinking about how things could have gone better. In the past I had a painful experience with one application I was supporting that needed to install NPM packages on deployment. During an NPM outage, we couldn't deploy (or scale up) until it was over. We realized it was safer to switch to server images with all assets pre-installed, plus an NPM cache to give the build a better chance of succeeding. The next NPM outage we only noticed after the fact :) I'm not certain how this particular deployment pipeline is failing due to the GitHub outage, but a post-mortem to discuss it may be helpful and protect against future issues.
It seems the convenience of cloud-based deployment pipelines is not really worth situations like this.