I was sitting in an escalation call, called by one of our most important customers, India’s largest ISP.
There were 13 of them, 3 of us, and only 1 technical (that’s me).
Things were super intense and extremely heated debates were going on. I was fixing a few issues while answering the questions coming my way. In parallel, I was also chatting with my teammates, who were back in the office on support calls with the same client’s ops team.
And then it happened:
rm -rf /var/lib/mysql (I was already sudo su)
The escalation got further escalated: “Dashboard is not opening.”
For a moment I was shell-shocked. I couldn’t hear a thing and just sat there frozen.
Then I remembered the backup I had taken, against my usual style, and the MySQL replica I hosted. Making an insincere effort to calm the people in the room, I got to work.
I restored the database, re-ran a few ETL jobs, and we were back online.
I was relieved, and actually quite happy. I started interacting with the people in the room again and showed them the system’s robustness even under a disaster.
Two of the 13 caught my bluff, but they were smiling. They winked, turned back to face the others, and the heated debate continued.
The escalation went all the way up to the CIO, but the guys who caught my bluff never gave my deed away (in return, I had to code a few more features and reports, just for them).
Don’t multitask, especially with sudo access.
The guys who caught me, and their team, were the actual users of the software.
It saves them millions of dollars per annum, because it lets them meet their SLAs.
It has now been running for 7 years in their NOC.
The users revolted against their CIO (the second or third replacement CIO), who proposed replacing this software (and others in their NOC) with an HP unified monitoring suite (part of the CIO’s digital transformation initiative). Apart from this software, the rest of the monitoring systems have been replaced.
:)