Hacker News
by genmud 859 days ago
> Mainframes are unbelievably powerful, feature-full, and cost-effective.

Maybe things have changed in the last 10 or 15 years, which is the last time I had a legit mainframe at a day job, but back then none of those things were true if you looked at it for more than 20 minutes.

I seem to remember nothing being included in the base mainframe… when you started to add things like DR and data duplication and virtualization, it became extremely expensive. Like on the DB side, effing Oracle was much cheaper.


Software licensing is expensive on them, but the people to keep your 99.999% uptime on your Kubernetes cluster aren't cheap either.

And software licenses are one of the reasons why LinuxONE machines exist - they don't run z/OS, so you don't pay those licenses. You can even start a dozen VMs under an LPAR and run your Kubernetes cluster as if it were running on more common hardware that just never, ever fails. IIRC, you can run a special version of z/VM to manage your Linux VMs if you don't want to run Linux on the LPAR and use KVM for your VMs.

When I moved from a tech company to an aero company that used mainframes, there were literally hordes of IBM contractors who helped maintain the mainframe environments. Getting anything done in those environments was a multi-month project, even for basic stuff like patching the OS. There were probably 2 or more sysadmins per rack, employed just to keep the damn lights on in those boxes.

For context, there was about 1 admin for every 400 servers at the tech company, and that was for the entire tech stack (LAMP).

Mainframes require a lot of care and feeding. In my experience, having worked at 3 different companies who relied on them (education, aero and finance), their capex is higher, their opex is higher and their uptime / reliability is entirely dependent on the facilities / staffing.

You don't think it requires some expensive mainframe admins, even with the fancy software? Anywhere I worked with a mainframe, the mainframe admin was highly distinguished within the org and extremely knowledgeable.

I would argue Kubernetes talent is cheaper than mainframe talent these days because it's ubiquitous.

I think the rough premise is there's a 1:>3-5 ratio of "super knowledgeable mainframe guy" to "super knowledgeable kubernetes guy".

Also, Kubernetes talent is far from ubiquitous. That sounds more to me like you're counting anyone who has successfully deployed one time.

Yeah, I am that super knowledgeable K8s guy, but I’d still say true mainframe admins are still a higher metric.

But in regards to K8s talent, lots of people think they know Kubernetes in production but have never had to actually upgrade a cluster, or go through the process of updating manifests, deployments, and CSIs, and actually dealing with API removals.

Agreed. The tooling around upgrades is painfully atrocious, and stuff like kubepug [1] should be part of the Kubernetes core.

[1] https://github.com/kubepug/kubepug
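
As a rough illustration of the kind of check a tool like kubepug automates, here is a minimal Python sketch. It is not kubepug's implementation or API. The `REMOVED_APIS` table, the `find_removed` helper, and the sample manifests are all hypothetical and hand-written for this example, and the table lists only a few well-known removals, so it is far from complete.

```python
# Hypothetical, hand-maintained table: (apiVersion, kind) -> the Kubernetes
# (major, minor) release in which that API version stopped being served.
# Only a handful of well-known removals are listed here.
REMOVED_APIS = {
    ("extensions/v1beta1", "Deployment"): (1, 16),
    ("extensions/v1beta1", "Ingress"): (1, 22),
    ("networking.k8s.io/v1beta1", "Ingress"): (1, 22),
    ("batch/v1beta1", "CronJob"): (1, 25),
    ("policy/v1beta1", "PodSecurityPolicy"): (1, 25),
}


def find_removed(manifests, target):
    """Return (name, apiVersion, kind, removed_in) for every manifest whose
    API version is no longer served at the target (major, minor) release."""
    hits = []
    for m in manifests:
        key = (m.get("apiVersion"), m.get("kind"))
        removed_in = REMOVED_APIS.get(key)
        if removed_in is not None and target >= removed_in:
            name = m.get("metadata", {}).get("name", "<unnamed>")
            hits.append((name, key[0], key[1], removed_in))
    return hits


# Sample manifests, already parsed into dicts (assumed input for this sketch).
manifests = [
    {"apiVersion": "apps/v1", "kind": "Deployment", "metadata": {"name": "web"}},
    {"apiVersion": "batch/v1beta1", "kind": "CronJob", "metadata": {"name": "nightly"}},
    {"apiVersion": "extensions/v1beta1", "kind": "Ingress", "metadata": {"name": "edge"}},
]

for name, api, kind, removed_in in find_removed(manifests, target=(1, 24)):
    print(f"{kind}/{name}: {api} was removed in {removed_in[0]}.{removed_in[1]}")
```

Running a check like this against the target version before an upgrade, rather than after, is the part that tends to get skipped.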

Why bother using an LPAR if you're just gonna use Kubernetes anyway? Why not just use one fat machine?

For many generations now, you can't run Linux on bare metal anymore; it has to be inside an LPAR. I think this is true for z/VM and the other operating systems as well.

Why bother using a mainframe LPAR for Linux at all? In most cases, it makes much more sense to run it as a guest under z/VM.

LPAR isolation happens on a lower level than z/VM or KVM. I don't think anyone has ever demonstrated a successful LPAR escape attack.

It may be implemented in the system firmware, but it's still a hypervisor performing context switches and enforcing access to PCI devices. Even if you've never looked at the processor architecture manuals you can tell this is what's happening when you can assign 0.1 cores to an LPAR. Different implementation details but the same functionality as SPARC LDOMs and Intel's VT-x and VT-d.

The lack of escape demonstrations is likely, at least in part, due to the fairly low availability of those systems to security researchers.

I do not want to make it seem that LPAR isolation is just waiting to be compromised, but security-by-unavailability also plays a part :)

OTOH, the technology has been in production for decades.
It took many years for Spectre and Meltdown to be discovered, and that was for CPUs affordable for individuals.

How many security researchers are even familiar enough with the concept of a mainframe to consider looking for an LPAR breakout, let alone have access to the necessary hardware?

"... some mainframes have models or versions that are configured to operate slower than the potential speed of their CPs. This is widely known as kneecapping , although IBM prefers the term capacity setting, or something similar. It is done by using microcode to insert null cycles into the processor instruction stream. The purpose, again, is to control software costs by having the minimum mainframe model or version that meets the application requirements."

https://www.ibm.com/docs/en/zos-basic-skills?topic=concepts-...

I'm lost in this thread.

This whole time I just thought mainframe was an older style word for a large rack based server or server room. Like 'cloud' storage for someone's computer running miniserve.

What is the real difference between a mainframe and, e.g. a rack full of H100s, or rack full of 100GBps networking stuff, or some nice stack of 12x blades with 8x 256 core CPUs?

Why or how does a "mainframe" have more power than that?

Mainframes deliver reliable uptime. They so far are the only ones that can do it reliably, for decades.

Essentially the bunch-of-boxes model (k8s being the new kid on the block) has been trying (and mostly failing) to provide what mainframes have been providing for 60+ years.

Which is being able to treat your workloads as just a random virtual job you can push wherever and let it run while also giving you ridiculous uptime.

Mainframes are basically hardware-infused uptime delivery machines. They can and will offer 5 9's without any trouble. AWS, Azure, Google's cloud: none of them can deliver that amount of uptime. They have ALL failed repeatedly, so much so that they purposely try to obfuscate their downtime records. Many don't make any historical data available.

k8s and the like have been trying and failing at reliable uptime. Sure, we've arguably been making some progress, but your average self-hosted k8s shop has full-time dedicated teams of people who do nothing but babysit k8s. How many staff does your average mainframe org dedicate to keeping the mainframe alive? Usually one person, maybe two. Of course, the price you pay IBM or whoever you choose as a mainframe provider will help offset the staff savings from your k8s team :)

It's not about raw compute power. It's about keeping a workload alive for as long as you can deliver power to the machine; that is what a mainframe promises. As parts fail (seriously, any part: memory, CPUs, disks, backplanes, it doesn't matter), the mainframe can route around the failed part and you can replace it without turning anything off or affecting your workload. This means your mainframe is sized larger than your workload, of course. It's not like the fundamentals of compute change in that regard.
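
To make the "sized larger than your workload" point concrete, here is a small Python sketch of the standard k-of-n availability calculation. It assumes identical parts that fail independently, which real hardware only approximates. The per-part availability of 0.99 and the part counts are made-up illustrative numbers, not figures for any real mainframe.

```python
from math import comb


def k_of_n_availability(a, n, k):
    """Probability that at least k of n identical, independent parts are up,
    where each part is up with probability a."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k, n + 1))


part = 0.99  # assumed per-part availability, purely illustrative
needed = 4   # assumed number of parts the workload actually needs

exact_fit = k_of_n_availability(part, n=4, k=needed)   # no spare capacity
one_spare = k_of_n_availability(part, n=5, k=needed)   # one spare part
two_spares = k_of_n_availability(part, n=6, k=needed)  # two spare parts

for label, value in [("no spare", exact_fit), ("one spare", one_spare), ("two spares", two_spares)]:
    print(f"{label}: {value:.6f}")
```

With no spare, every part is a single point of failure and the system is less available than any one part. Each spare that can be swapped in without stopping the workload adds roughly another factor of improvement, which is the trade being paid for in oversized hardware.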

The question then becomes, is the juice worth the squeeze? If your entire business model requires uptime, then you best really, really care about uptime. There is a reason the Visa and Mastercard networks have basically never, ever been down. It's because they know their business only exists as long as their network works. When you want uptime at all costs, you don't run k8s (or whatever the latest craze is tomorrow), you run a mainframe.

Most of us get more uptime than we need with <insert favourite cloud provider here>. Uptime isn't something they actually sell when you read the contracts you sign. Uptime is just marketing spam.

> They so far are the only ones that can do it reliably, for decades.

AFAIK VAX's are still around, though good luck trying to buy a brand new one. ;)

VMS still exists and is ported to x86! :)

I should point out, I was talking about mainframes in general, to include Vaxen, Fujitsu, Unisys, etc. IBM isn't the only mainframe game in town.

The term "mainframe" has never included VAXes. Not sure why you're trying to add it now. ;)

The VAX 9000 was the end of their mainframe line of VAX machines: https://en.wikipedia.org/wiki/VAX_9000

You oversell mainframes. Not sure why you're mentioning k8s, but Google, which brings in 10x more money than Visa and Mastercard combined, does not use mainframes at all, only k8s or an equivalent, and yet they don't have downtime either.

With k8s on a major cloud provider you get 99.99% for what? $150/month? And there is no maintenance whatsoever.

A full team managing k8s is a lie, even on-prem. What exactly is there to manage on a daily basis?

You are talking about hosted k8s, which is not what I'm talking about. If you have never worked on an on-prem k8s team, you are totally missing out! :)

k8s is brittle, all the other competing tools are also fairly brittle, so it's not like k8s is really alone here. This is why we keep replacing the newest mess of virtualization every once in a while, someone eventually gets fed up with whatever the current crap is and writes a new one. It becomes popular, breaks in new and unique ways, rinse and repeat.

k8s is not even a decade old at this point. Mainframes have been around with 5+ 9's of reliability for 6+ decades.

I'm not overselling what mainframes provide. I think you just have selective memory.

99.99% is not what mainframes provide. They provide many more 9's than that; think 7+.
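
For a sense of scale, the arithmetic behind "N nines" can be sketched in a few lines of Python. This is just the definition turned into code; it assumes a 365-day year, and `downtime_seconds_per_year` is a helper name made up for this example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; ignores leap years


def downtime_seconds_per_year(nines):
    """Allowed downtime per year, in seconds, for an availability with the
    given number of nines (nines=4 means 99.99%)."""
    return SECONDS_PER_YEAR * 10 ** (-nines)


for n in (4, 5, 7, 8):
    seconds = downtime_seconds_per_year(n)
    print(f"{n} nines: {seconds:.3f} s/year ({seconds / 60:.2f} min/year)")
```

Four nines allows about 52.6 minutes of downtime a year and five nines about 5.3 minutes, while seven and eight nines come out at roughly 3 seconds and 0.3 seconds a year.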

Most of us don't even need 99.99% uptime, so most of us probably shouldn't be buying mainframes. If you DO need severe uptimes, then you either have a huge oversubscription and dedicated teams of people around the clock babysitting things or you buy into mainframes. You absolutely don't buy AWS or Google Cloud or Azure and say, that's good enough, because their uptimes are just marketing speak, not reality.

Even if what you're saying were true, the z16, for example, targets different needs. Can you scale up to seven or eight nines, and even if you could, could you do it for the same price? That's 3 seconds or 300 milliseconds of downtime per year, reliably. Along with predictable high (vertical) performance, hot-swap of anything including memory, resilience with hardware under provisioning, etc. That's what these beasts are for.

> With k8s on a major cloud provider you get 99.99% for what? $150/month? And there is no maintenance whatsoever.

Gross understatement of costs, don't you think? Last time I checked it was a base fee of $150/month for managed k8s, plus whatever computing resources you end up using.

I think a lot of people might say only IBM sells mainframes - the “z” system.

It’s designed to work as a single high-availability machine with a focus on I/O speeds, rather than a PC cluster that implements high availability at the application level and communicates over IP. You can hot swap faulty components without turning it off. I haven’t ever used one, but I believe you write application code without worrying about such details - the hardware and the libraries you link against will just “make it work”.

Perhaps my knowledge is slightly off though.

Many companies also still rock Fujitsu or Unisys mainframes - IBM isn't the only player in this town. But at least Fujitsu's mainframe EOL is in 2035, so that won't be more than historical trivia soon.