by Chris_Newton 1402 days ago

You know developers who've never let a bug or performance issue enter production (with or without testing)?

One of the first jobs I ever had was working in the engineering department of a mobile radio company. They made the kind of equipment you’d install in delivery trucks and taxis, so fleet drivers could stay in touch with their base in the days before modern mobile phone technology existed.

Before being deployed on the production network, every new software release for each level in the hierarchy of Big Equipment was tested in a lab environment with its own very expensive installation of Big Equipment, exactly like the stations deployed across the country. Members of the engineering team would make literally every type of call possible, using literally every combination of sending and receiving radio authorised for use on the network, and, if necessary, manually examine all kinds of diagnostics and logs at each stage in the hardware chain to verify that the call was proceeding as expected.

It took months to approve a single software release. If any critical faults were found during testing, game over, and round we go again after those faults were fixed.

Failures in that software were, as you can imagine, rather rare. Nothing endears you to a whole engineering team like telling them they need to repeat the last three weeks of tedious manual testing because you screwed up and let a bug through. Nothing endears you to customers like deploying a software update to their local base station that renders every radio within an N-mile radius useless. And nothing endears you to an operations team like paging many of them at 2am to come into the office, collect the new software, and go drive halfway across the country in a 1990s-era 4x4 in the middle of the night to install that software by hand on every base station in a county.

Automated software testing of the kind we often use today was unheard of in those days, but even if it had been widely used, it still wouldn’t have been an acceptable substitute for the comprehensive manual testing prior to going into production. As for how the developers managed to have so few bugs that even reached the comprehensive testing phase, the answer I was given at the time was very simple: the code was extremely systematic in design, extremely heavily instrumented, and subject to frequent peer reviews and walkthroughs/simulations throughout development, so that any deviations were caught quickly. Development was, of course, much slower than it would be with today’s methods, but it was so much more reliable in my experience that the two alternatives are barely on the same scale.