Hacker News new | ask | show | jobs
by tzs 4355 days ago
Ahhh....the days of sharp tools, no failsafes, and young programmers or admins.

I recall that time I wrote a batch manager for the VAX 11/780 at Caltech High Energy Physics. It consisted of a program to monitor the batch queue and start jobs as scheduled ("BATch MANager", or "BATMAN"), and a program for users to submit jobs ("Run Overnight Batch INput", or "ROBIN").

The configuration file for BATMAN was stored in /etc/batman.

During development, I occasionally had to "rm /etc/batman". Of course, out of habit, as soon as I typed "/etc/" my fingers would automatically type "passwd", and once I did not catch this in time. Oops. It happened to be a Sunday morning at around 7AM, and I had to call the other admin, who handled backups, to come in and restore that file. He was annoyed.

The second time I did this, he was pretty pissed.

The third time, I fortunately had been working at the terminal we had in the machine room, and managed to shut down power to the machine before the write buffers were flushed, and the file was OK after fsck. I didn't have to deal with an angry co-admininstrator that time. Just angry physicists.

The other admin (Norman Wilson, in case anyone knows him or he reads HN) then made a link named /etc/safe_from_tzs to /etc/passwd to stop my nonsense once and for all.

That worked until the first time I wanted to overwrite /etc/batman instead of rm it.

That led to a cron job that maintained a copy of /etc/passwd in a separate file, and periodically checked to see if it were missing or misformatted, and restored it if so.

1 comments

One would think after the first two times you'd find a better way to do this, realizing your infrequent but habitual mistake. Why didn't you change any of your practices after the first two screw ups?
Not speaking for tzs, but back in the good ol' days, everybody was pretty busy. A lot of software got written by operators a little bit here and there in between running jobs and moving paper around the building and that sort of thing; paid programmers were frequently dealing with change requests from business departments, all of whom wanted their thing done yesterday; and depending on the size of the organization, there might be a PFY or two, but they generally weren't allowed anywhere near production hardware.

For example, one of my early jobs in IT involved running batch programs that produced reports on a mainframe designed for the punch card era. It had moved on from punch cards, but all of the batch jobs still expected them as input, so they were stored instead as "digital cards" in the job files themselves. The operator -- me -- would be responsible for bringing a job up on the terminal, changing each occurrence of some two-letter code in each card file to some other two-letter code, some date code to some other date code, and so on. Each batch job might be just one step of half a dozen or so required to produce paper printouts from the database. The terminal emulator did not have a find & replace function. Naturally, I screwed up jobs on a regular basis.

This mainframe ran mainly on COBOL74. Over the course of a lot of unpaid overtime, a few hours here and there for several weeks, I gradually wrote a variable interpolator in COBOL that could be called as the first step of a batch file and would replace all occurrences of a variable tag with an input parameter passed to the job. Instead of pulling up a job file and replacing a bunch of two-letter codes, you'd just run the job with the two-letter code as a parameter, and this program would rewrite all of the data cards in the batch file. COBOL has no string operators or a string data type, but I found a way to abuse some system calls to make it work.

So it took weeks to fix the most common operator error in that shop.

IT staff spend more time on Facebook, Reddit, HN, and online gaming now than we ever had available for fixing processes back in the day.

I'm not absolutely sure on the number of times. It is possible that I only rm'ed it twice, and then Norman made the link to end that. This would have been around 1981, so there has been some memory fade.
are you that angry co-admininstrator ? just curious
hahaha :) I didn't mean the comment to sound angry - nope not the co-admin. I do have a running interest in stuff like continuous improvement, organizational excellence etc. so consider this field research!