by PaulRobinson
299 days ago
I remember reviewing some code from an engineer I was managing at a FAANG. I noticed an edge case and pointed out that if/when it hit, it was going to cause an alarm that would page on-call. He suggested it might be OK to ship because it was "about a one in a million chance of being hit". The service involved did 500,000 TPS. "So, just 30 times a minute, then?"

And you're right about the amount of engineering that goes into solving problems. One service adjacent to my patch was more than a decade old. It was low TPS but on the critical path for a key business problem. It had not been touched in years and hadn't caused a single page in that decade. It just trudged along, a really solidly engineered service. Somebody suggested we rewrite it in a modern architecture and language (it was a kind of mini-monolith in a now unfashionable language). Engineering managers and principals all vetoed that, thank goodness - it would have been 5+ years of pain for zero upside.
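For anyone who wants to check the "30 times a minute" arithmetic, here is a minimal sketch. The function name and parameters are made up for illustration, and it assumes every transaction independently has the same one-in-a-million chance of hitting the edge case.

```python
def expected_hits_per_minute(tps: int, one_in: int) -> float:
    """Expected number of edge-case hits per minute.

    tps    -- transactions per second handled by the service
    one_in -- the "one in N" odds of a single transaction hitting the case
    """
    seconds_per_minute = 60
    return tps * seconds_per_minute / one_in


# Numbers from the story: 500,000 TPS and a "one in a million" edge case.
hits = expected_hits_per_minute(tps=500_000, one_in=1_000_000)
print(f"{hits:.0f} expected hits per minute")
```

At that rate the "rare" case fires about once every two seconds, which is why the odds alone say very little without the traffic volume next to them.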