HN Mirror


by heelix 1918 days ago

For us, one of our junior devs managed to wipe out one of our Elastic instances across all environments and all datacenters. She got handed a task to modify an index. The dev lead and senior devs were 'too busy', so she looked up how to do it on Stack Overflow and the change was rubber-stamped. What she did worked - but dropping and recreating the indexes wiped out all the existing documents. The collection was large enough that it was not apparent that bad things were happening to the documents on the system. Once she thought her test worked, she scripted it up, and everything went away as the automation ripped through the systems. Three days later we had everything restored. The good news is that the older system the new release was replacing was still operational, so we fell back on that.
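
For anyone who hasn't run into this failure mode, here is a toy sketch of the difference between drop-and-recreate and the safer build-a-new-index-then-swap approach. It is plain Python with an in-memory dict standing in for the cluster. It is not the Elasticsearch API and not the script from the incident. `ToyCluster`, `drop_and_recreate`, `reindex_and_swap`, and the index names are all made up for illustration.

```python
# Toy in-memory stand-in for a search cluster. NOT the Elasticsearch API;
# every name here is hypothetical and exists only to illustrate the idea.
class ToyCluster:
    def __init__(self):
        self.indexes = {}  # index name -> {"settings": dict, "docs": dict}
        self.aliases = {}  # alias name -> index name

    def create_index(self, name, settings):
        if name in self.indexes:
            raise ValueError(f"index {name!r} already exists")
        self.indexes[name] = {"settings": dict(settings), "docs": {}}

    def delete_index(self, name):
        # Deleting an index takes every document in it along.
        del self.indexes[name]

    def put(self, name, doc_id, doc):
        self.indexes[name]["docs"][doc_id] = doc

    def count(self, name):
        return len(self.indexes[name]["docs"])


def drop_and_recreate(cluster, name, new_settings):
    """Risky: the new index has the right settings but starts out empty."""
    cluster.delete_index(name)
    cluster.create_index(name, new_settings)


def reindex_and_swap(cluster, alias, new_name, new_settings):
    """Safer: build a new index, copy documents, verify, then repoint the alias."""
    old_name = cluster.aliases[alias]
    cluster.create_index(new_name, new_settings)
    for doc_id, doc in cluster.indexes[old_name]["docs"].items():
        cluster.put(new_name, doc_id, doc)
    if cluster.count(new_name) != cluster.count(old_name):
        raise RuntimeError("document counts differ; leaving the alias untouched")
    cluster.aliases[alias] = new_name  # the old index stays until removed on purpose
    return old_name


def demo():
    # Risky path: three documents go in, none survive the change.
    risky = ToyCluster()
    risky.create_index("orders", {"shards": 1})
    for i in range(3):
        risky.put("orders", i, {"n": i})
    drop_and_recreate(risky, "orders", {"shards": 2})

    # Safer path: same change, documents are copied before the alias moves.
    safe = ToyCluster()
    safe.create_index("orders_v1", {"shards": 1})
    safe.aliases["orders"] = "orders_v1"
    for i in range(3):
        safe.put("orders_v1", i, {"n": i})
    reindex_and_swap(safe, "orders", "orders_v2", {"shards": 2})

    return risky.count("orders"), safe.count(safe.aliases["orders"]), safe
```

Calling `demo()` gives the document counts after each approach. The point of the second path is that the old index is still sitting there if the count check fails or someone notices a problem later, so a rollback is an alias flip rather than a three-day restore.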

A few learning experiences. Elastic was brand new to our mix, so not a lot of domain knowledge there. We discovered how dangerous a handful of curl commands could be with the 'stock' permissions the developers had and fixed that. Also became a nice conversation on code reviews, signoff, and deadlines. Also about actual DR readiness and how long it takes to actually restore. We got a lot of value out of that mistake.

She is still my favorite developer. I'd steal her for my personal team any day, had she not been stolen from our group by another one a couple of years later. (shakes fist) As one would expect, she apologized and learned the lessons. She was one of the youngest people we ended up giving root on a 12B-document prod instance, because I knew she would do it carefully and correctly.


raverbashing 1918 days ago

Nice story.

Your dev lead/senior engineer should never have been "too busy" to supervise someone doing that on their first day.


llampx 1918 days ago

Never too busy to fix a fuck-up that could have been prevented.


heelix 1918 days ago

She had been with us for about a year when that happened. The 'first day' story linked by Reddit was someone else. It was her first time modifying an index on ES, however. I consider the scrum master and PO complicit too, as resources were tight and they still pushed for the work to be completed. This one bit us hard. The team was broken up and moved on. She was the only one I personally asked to have as a direct report.
