by cpplinuxdude 2873 days ago
That’s why you need good humans who are expert at responding extremely well when things break. It’s one thing to prep for a whiteboard interview; it’s another to intuit that needle in the haystack.
2 comments

It's only about finding needles in haystacks if you don't have sufficient instrumentation and monitoring in place. For most production issues your tools should be able to guide you at least the first 75% of the way to your issue, even if they can't usually offer a good fix (though sometimes they can, such as when New Relic points out missing DB indices causing unnecessary full table scans).
I agree. It's also about trusting developers' intuition when debugging problems. We recently (over the past few working days) went through something similar: a problem was blocking us from a release, and people were scrambling to figure it out.

We have some software that was returning different results from different environments, and we couldn't figure out the problem. There was a lot of panic in the room, from upgrading and downgrading Maven dependencies, building things inside and outside of Jenkins, and all sorts of random things.

We kept telling the project leadership that (intuitively) we were poking at the wrong part, but they kept pushing. I had to explain how Maven works, how building on Jenkins doesn't differ from building in our IDEs, etc.

It's only when we asked for isolation from the (human) elements, that we had the freedom to properly debug.

In the end, an unstable sort was the cause of the issue. We were taking the last element from an array, but not sorting the array first.
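For anyone wondering how this kind of bug makes two environments disagree, here is a minimal sketch. It is not the code from the story: the class name, method names, and sample data are all made up for illustration, and it assumes the two environments simply handed back the same elements in a different order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; none of these names come from the real project.
public class LastElementDemo {

    // Fragile: the result depends on whatever order the list arrived in.
    static int lastUnsorted(List<Integer> items) {
        return items.get(items.size() - 1);
    }

    // Deterministic: sort a copy first, then take the last (largest) element.
    static int lastSorted(List<Integer> items) {
        List<Integer> copy = new ArrayList<>(items);
        Collections.sort(copy);
        return copy.get(copy.size() - 1);
    }

    public static void main(String[] args) {
        // Same elements, different arrival order (assumed environment difference).
        List<Integer> envA = Arrays.asList(3, 1, 2);
        List<Integer> envB = Arrays.asList(2, 3, 1);

        // The unsorted version disagrees between the two "environments".
        System.out.println(lastUnsorted(envA) + " vs " + lastUnsorted(envB));

        // The sorted version agrees regardless of arrival order.
        System.out.println(lastSorted(envA) + " vs " + lastSorted(envB));
    }
}
```

Nothing about Maven or Jenkins has to change for the first version to give different answers; only the input order does, which is why shuffling the build setup around never fixes it.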

None of the stuff we did from last Thursday to Tuesday evening helped us.

So, I agree, you need good humans who are good at responding well when things break.

I've been in situations where the whole developer team is trying to solve a production bug, and the managers have never tried to tell us where to focus.

That wouldn't make any sense at all.

Every developer would have their own guesses that they would need to explore and validate. Sometimes there's some grouping around where the focus is, but there's usually one guy who's exploring a totally different area to find that bug.