Hacker News
by bitbuilder 1993 days ago
I wouldn't be surprised if it's actually a combination of a new feature being recently rolled out, along with the sudden spike in load this morning.

The holidays are actually the perfect time for Slack to roll out a risky deployment, as it has to be their lowest usage time. So it would make sense if something was pushed out last week or the week before. And everything probably seemed fine.

And then this morning they suddenly realize this new feature does not perform under load. And to make matters worse, the new feature has been out long enough to make any sort of rollback very tricky, if not impossible. Which means they'd need engineers to desperately hack out, test and deploy a code fix.

If this is the scenario, I do not envy them at all.

1 comments

Holidays are a good time for a company to do a risky deployment, but a bad time for an individual employee to do a risky deployment, assuming one doesn't want to work overtime over the holiday fixing things.
Depends on how well compensated holiday overtime is. There are some employees happy to work overtime if their hourly pay is doubled or tripled. However, there are also those who wouldn't do that for any price.
Depends how badly it goes wrong. My org is a 24/7 one, but one Christmas back in the 90s (way before my time) some work was done on Christmas Eve. I think it was on the phone system, in the days before widespread mobile phones.

It broke, which was a major problem. This meant that senior management were being phoned (ho), and relatively high-level middle managers were on site to deal with the fallout. Of course, most suppliers were also closed, so everything was harder to fix.

There are good reasons not to make changes when places are closed, or at least running on a skeleton staff, for 2 weeks.

This depends on how easy or difficult the rollback strategy is.