by cvrjk 2016 days ago
Welp, as a new grad there, I brought down one very important database server on a Sunday night (a series of really unfortunate events). Multiple senior DBAs had to be involved to resuscitate it. It started functioning normally just a few hours before market open in HK. If it had been any later, there would have been some serious monetary loss. Needless to say, I was sweating bullets. Couldn't eat anything the entire day lol. Took me like 2 days to calm down. And this was after I was fully shielded cuz I was a junior. God knows what would've happened if someone more experienced had done that.
3 comments

I brought down the order management system at a large bank during the middle of the trading day. The backup kicked in after about a minute but it was not fun on the trading floor.
I'm so glad I'm not the only one feeling deployment anxiety. The project I'm involved in doesn't really have serious money involved, but when there's a regression found only after production deployment my stress levels go up a notch.
When I was working at a pretty big IT provider in the electronic banking sector, we (management and senior devs) made it an unspoken rule that:
- Juniors shall also handle production deployments regularly.
- A senior person is always on call (even if only unofficially / off the clock).
- Junior devs are never blamed for fuckups, irrespective of the damage they caused.

That was the only way to help people develop routine regarding big production deployments.

Same thing -- used to work at a very large hosting provider. One of our big internal infra management teams wouldn't consider new hires fully "part of the team" until they had caused a significant outage. It was genuinely a rite of passage, as one person put it, "to cause a measurable part of the internet to disappear".

I got to see a lot of people pass through this rite of passage, and it was always fun to watch. Everyone would take it incredibly seriously, some VP would invariably yell at them, but at the end of the day their managers and all their peers were smiling and clapping them on the back.

Sounds like hazing.
As a new grad there, it wasn't your fault. There should be guardrails to protect you.
Yep. It was supposed to be a very small change. I blundered. My team understood that and was super supportive about it all too. But this was after it was all fixed.

During the outage though, no one (obviously) had time for me. This was a very important server. The tension and anxiety on the remediation call were through the roof. With every passing hour, someone even more important in the chain of command was joining the call. At that time I thought I was done for...