|
In 2003, I was working for a startup in India building a GPS/GSM-based vehicle tracking system for fleets of trucks. The trucks had our unit installed, and the unit would use GPS to get its location and send it to our server via GSM text message. Back then, GSM coverage was not good, and trucks would go out of coverage for days. To further complicate matters, our firmware used to crash, and the unit would stop sending updates. To help us troubleshoot this, my boss asked me to program the unit to give a missed call to the server every hour. If we got a missed call, we knew that the unit was still working. In countries like India, giving a missed call is a zero-cost way to communicate. For example, you would pull up in front of a friend's place and give them a "missed call" to let them know that you are waiting outside. Anyway, I implemented the logic, and we sent our field techs off to intercept trucks on highways and update the firmware. The way I implemented it, each unit called our server's modem number every hour, at the top of the hour. No random delay, nothing. So, soon after that, around 50 units tried to call our server at the same time. Remember, the clocks in the units ran off GPS, so they were super accurate. This caused our telecom company's cell tower BTS to crash. Cell service in my office area, a busy part of Bangalore, was down for a whole 2 hours. I was called into the telecom company's head office for their postmortem. They didn't yell at me or anything. They were super nice. In fact, when I finished explaining my side of the story, one of their engineers opened his wallet and gave a hundred rupees to another guy. I guess they had been betting on the root cause. From what I understand, they escalated the bug to Ericsson, who manufactured the BTS, and got it fixed. For my part, I added a random delay and eventually removed that feature. |
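
For anyone curious what the random-delay fix looks like, here is a minimal sketch in Python. It is not the original firmware, which ran on an embedded unit; the function name `next_call_time` and the 10-minute `max_jitter_seconds` window are hypothetical choices made up for illustration. The idea is just that each unit still checks in once an hour, but picks its own random offset past the top of the hour instead of dialing at exactly :00.

```python
import random
from datetime import datetime, timedelta


def next_call_time(now, max_jitter_seconds=600, rng=random):
    """Return the next hourly check-in time with a random delay added.

    `max_jitter_seconds` is an assumed window (10 minutes here); the
    story does not say how large the real delay was.
    """
    # Top of the next hour: the instant at which every GPS-synced unit
    # would otherwise dial simultaneously.
    top_of_next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    # Per-unit random offset that spreads the calls across the window.
    jitter = rng.uniform(0, max_jitter_seconds)
    return top_of_next_hour + timedelta(seconds=jitter)


if __name__ == "__main__":
    now = datetime(2003, 6, 1, 10, 17, 42)
    # Simulate 50 units that all share the same accurate clock.
    calls = sorted(next_call_time(now) for _ in range(50))
    print("first call:", calls[0])
    print("last call: ", calls[-1])
```

With 50 units and a 10-minute window, the calls land roughly 12 seconds apart on average rather than all in the same second. The window size is a trade-off: a wider one is gentler on the tower but makes the "is this unit alive?" signal a little less punctual.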