HN Mirror


by kochbeck 3996 days ago

Heh, yeah, as someone who still occasionally advocates for converting something back to batch, I know. Bear in mind, I'm typing this from an XOpen-certified UNIX machine, so obviously I wasn't a total rwars ideologue (religious OS wars for those too young to remember the term).

The main issue was job completion predictability - most things we do with computers are fundamentally batch, and almost all the really, really important ones like bank account daily settlement and reconciliation are totally batch. There's simply nothing to be done while you wait for the process to complete, nor anything of higher priority that you'd want to preempt that task. So the question is, if the task is business-critical or if it's critical to major institutions such as the global economy - like, say, the Depository Trust Company's nightly cross-trader settlement process which is, in fact, still a mainframe batch - why would you want the process to take anything other than a deterministic length of time for a fixed input? You'd be willing to commit a whole piece of hardware to getting the job done, right? As it turns out, that's the reason. There are an awful lot of things that are more important than economical full utilization of a machine, and most of those tasks are still carried out on mainframes, and usually they're still done in batch.

There are a bunch of secondary reasons as well, though: a 3270 terminal ran in the thousands of dollars a unit in the 1980s; the network was really, really slow, and sharing the terminal server was worse than slow; if you were lucky(?) enough to have a token ring desktop and CM/2 on your machine so you didn't need a 5250 death-ray CRT next to you, you were unlucky enough to be on token ring and good luck with that; at 9am when the world woke up and logged in, the entire SYSPLEX ground to a halt waiting for all the interactive logins to complete, even though folks would then idle most of the day... on and on and on, and all of those were issues with time-sharing systems that, for most applications, worked just as well if you punched a record card (I know, right? Punch cards...), put it in a stack, and handed it off to the data processing department at 5pm.

If I still had $X billion in transactions to clear a day where X > a number that would get me jail time if I screwed up, I would probably still do it on a zSeries mainframe running CICS and IMS but running almost totally in batch. Because why chance it?

3 comments

nickpsecurity 3996 days ago

I agree, great perspective. I'll further it by saying there's a ton of uses for batch, even where the goal is utilization. Any job that is a time-consuming hand-off from users or developers might be most effective in a batch run. The reason is that, esp w/ I/O processors, the system keeps running without swapping CPU state, cache, or memory pages. Highly efficient for a given task even without a time requirement.

Personally, I think the better and modern take on it is a compute cluster where certain nodes can be brought up for dedicated batch runs while others run interactive functionality. The embedded safety and security scene has been trying to do it with partitioning MILS kernels that strictly separate and schedule workloads based on priority w/ fault-isolation. Recent ones allow resource donation by partitions that are done, so waste is minimal. Finally, there are security benefits in that batch runs make it easy to eliminate covert storage and timing channels. Hell, you can even do what I did (and cloud is just now doing) in designing a custom OS image per batch app to load for that on a minimal kernel. Reduces resource requirements and problems.


marktangotango 3996 days ago

Great perspective, not many people today have experienced the sheer magnitude of mainframe batch workloads. The other thing about timesharing on the mainframe was system stability. IBM went to great lengths via CICS to totally lock down stateful, bidirectional interaction with the mainframe. Why was this? I've always wondered, why not time sharing?


MichaelGG 3996 days ago

>not many people today have experienced the sheer magnitude of mainframe batch workloads

Can you elaborate? For instance, SWIFT does like 15M messages/day (according to Wikipedia). That's...really not that much in absolute terms for even a cheap server today.
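To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the 15M/day figure quoted above and an even spread over 24 hours, which is a simplification since real traffic is bursty; the function name is just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(messages_per_day: float) -> float:
    """Average messages per second if load were spread evenly over a day."""
    return messages_per_day / SECONDS_PER_DAY


# 15M/day is the figure quoted in the comment; the even spread is an assumption.
swift_rate = average_rate_per_second(15_000_000)
print(f"{swift_rate:.1f} messages/second on average")  # prints 173.6
```

Peak load would be some multiple of that average, but the order of magnitude is the point.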


tolle 3995 days ago

But there's a ton of bookkeeping and checks to be done for every transaction, due to regulations and so on. But yes, it's not an insane amount of transactions.


marktangotango 3994 days ago

Note I specified 'batch'. One mainframe app I'm familiar with handled > 50% of mutual fund transactions in the US. Including 401k, this could be > 100 million transactions per day when follow-on broker/dealer cascade transactions are accounted for. Each transaction required nontrivial amounts of processing, reporting, and logging.

Also note that a 'transaction' in this sense is not an HTTP request/response. The way it's used in mainframe systems, it's a business transaction which can include hundreds of smaller 'transactions'.

Banks and other financial systems can't perform daily reconciliation until markets close and stock and fund prices are known. Hence these systems store everything up to process in a nightly window.
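For a sense of what that nightly window implies, here is a rough sketch of the sustained rate needed. The 100 million figure is from above; the 100 sub-operations per business transaction (the low end of "hundreds") and the 8-hour window are assumptions picked purely for illustration, not numbers from any real system.

```python
def required_rate(business_txns: int, sub_ops_per_txn: int, window_hours: float) -> float:
    """Sustained sub-operations per second needed to finish inside the window."""
    total_ops = business_txns * sub_ops_per_txn
    return total_ops / (window_hours * 3600)


# 100M business transactions is from the comment above.
# 100 sub-operations each and an 8-hour window are hypothetical values.
rate = required_rate(100_000_000, 100, 8)
print(f"{rate:,.0f} sub-operations/second, sustained")  # prints 347,222
```

A shorter window or more work per transaction pushes the number up proportionally, and that is before any reporting and logging.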


kjs3 3995 days ago

I hear you, brother. I was doing time on Crays during the transition from COS (batch OS) to UNICOS (Unix-based multiuser OS). We bitched about what a pain it was to do the coding and marshal the data on front-end VAX/VMS machines, then submit it to the COS queue and wait for the results. Then we got to all share the same machine for everything, and we all went from "this should take about X time to run" to "fuck all if I know how long this is going to run".

Minor nit...we found token ring degraded vastly better under load than ethernet. While 10Mb enet was faster bursting than 4Mb TR, the aggregate utilization for TR was better and more deterministic. Maybe your SNA folks were oversubscribing the ring. But, yeah, pretty much everything in the IBM ecosystem was 10x the cost of the emerging ethernet world and that pretty much doomed it.


kjs3 3995 days ago

Oh, yeah...and we figured out that most network traffic was, in fact, bursty. So another nail in the coffin, as it were.
