by jolynch
3518 days ago
I respectfully disagree. I'm all for root cause analysis and taking the time to fix things upstream, but I also think that it's easy to say that and hard to actually do it. Yelp doesn't make more money, and our infra isn't particularly more maintainable, when I invest a few weeks debugging Ruby interpreter/library bugs, especially not when there are thousands of other higher priority bugs I could be determining the root cause of and fixing. For context, we spent a few days trying to get a reproducible test case for a proper upstream report, but the issue was so infrequent and hard to reproduce that we made the call not to pursue it further and just mitigate it. I do believe that mitigating rather than root-causing is sometimes the right engineering tradeoff.
Now, given the context, it doesn't matter whether or not the company or the product dies, so I can see where you're coming from. In any serious enterprise that would not be tolerated, but when your code base already has 'thousands of other higher priority bugs' it's a lost cause; point taken. At some level, though, you have to wonder whether you have 'thousands of higher priority bugs' because there is such a cavalier attitude to fixing them in the first place.