by mjthompson 1839 days ago
This is annoyingly vague. What was the software bug, and what was the valid customer configuration change?

It's perhaps a bit premature to demand it at this point, but I'm hoping a full post-mortem will outline precisely how this change was not picked up in pre-prod. Surely all valid customer configurations must be tested prior to rollout.

2 comments

It's not just annoyingly vague. It's insultingly vague.

If my data centre provider suffered a complete outage, then I demand to get a detailed post-mortem of what happened (in due time). If they just tell me bullshit PR speak about "We value our customers", I'll be looking at switching providers.

As a Fastly customer whose site went down, I'm entitled to know exactly what happened. If they don't tell me, I'm switching CDNs as a matter of priority.

> As a Fastly customer whose site went down, I'm entitled to know exactly what happened.

Does your contract say you're entitled to an RCA?

As others have said, this is more of an update than a complete RCA of the entire situation. They have short-term tasks that they've described in this summary post, and I would expect them to give a more complete analysis later.

It does say they haven't finished rolling out the permanent fix (i.e. such a customer configuration, or exploit, could still bring down some servers) and that they are conducting a full post-mortem. So hopefully a juicier post to come.

> As a Fastly customer whose site went down, I'm entitled to know exactly what happened. If they don't tell me, I'm switching CDNs as a matter of priority.

If you are a hardcore user of their VCL on the edge, I'm very curious where you would go. The last time I looked (a year ago), no one came even close to giving customers that level of control over request processing. Most of them fail to do complicated stuff with CORS without doing an arabesque while balancing on a medicine ball (looking at you, Lambda@Edge), not to mention the ability to massage the response.

If they tell everyone exactly how to trigger the bug before they finish rolling out a fix, people will trigger it on purpose to bring down websites.