Hacker News
by oneplane 599 days ago
It's not really about cloud vs. on-prem, it's the fact that people cut corners and lack knowledge on-prem, and don't have the budgets to do anything about it.

What you're referring to is mostly about elasticity, and it's true that if you don't need it, it doesn't make sense to pay for it.

But that doesn't mean that on-prem (which almost always turns into a virtual machine shitshow with crappy network design -- which will continue as long as nobody implements things like strong IAM and Security Groups in their on-prem setups) is 'the same' as cloud but just in a physical location you control.

The inverse is also true. If you just run some VMs 'in the cloud', you're doing it wrong. Playing datacenter is just as bad as not moving away from classic virtual machines, cloud or no cloud.

So when they are setting up config files for the cloud they don't cut corners? It is an insane amount of work to follow safe practices when configuring your cloud.

I don't see that much difference compared to doing actual admin tasks.

The entire underlying layer of possible misconfigurations is absent in the cloud. Yes, the services on top of that can still be misconfigured, but you don't get access to hosts, SANs, switches, firewalls, or gateways, so there isn't anything there for you to mess up. The shared responsibility model also allows you to pick even more robust options.

But even if you were to stick to something simple, say, object storage: a bucket or blob store has no SAN config, no webserver config, no switches, no gateways, no RAID controllers, no striping, mirroring, or parity configuration, no firmware, no BIOS, no BMC, no OS. None of that. It's all eliminated. All that remains is the top layer where you configure your cost-to-resilience ratio and your access policy. And yes, you could cut corners, but those are orders of magnitude fewer corners than you could be cutting if you include all the stuff below it.

Add to that: almost all of it has good APIs that are well defined, well supported and have an ecosystem to go with it. Try finding anything like that for a crappy NetApp or EMC appliance you find in a datacenter. It either doesn't exist, or it's so bad you might as well run MinIO or a bloody NFS share (not actual object storage) yourself.

Being bad at cloud is definitely more expensive than being bad at on-prem, I'll give you that. But with cloud, at least you get a bill that you can use to show your peers and higher ups that being bad has a cost. Internal virtual/amortised dollars are much harder to allocate to incompetence. It's often completely ignored, and at best revisited at periodic capacity planning reviews with few to no consequences.

The only place on-prem has is with locality requirements. That includes latency-sensitive things where sub-1ms is a goal, and air-gapped things. But even in the first case things like an AWS Outpost exist, and those are cheaper than doing it yourself (not much, but enough to save on the hardware and on 2 FTEs).

My friend, some of the biggest data leaks happened because of misconfigured S3 buckets, which literally take one line of code to get right.
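For what it's worth, that "one line" usually boils down to whether the bucket policy grants access to everyone. A minimal sketch in Python, assuming the standard S3 bucket policy JSON layout (`Statement`, `Effect`, `Principal`). The helper name `allows_public_access`, the bucket name and the sample policies are made up for illustration. It ignores `Condition` blocks, ACLs and account-level public access settings, so treat it as a rough lint rather than a real audit:

```python
import json

def _is_wildcard_principal(principal):
    """True if the Principal element matches everyone ("*" or {"AWS": "*"})."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False

def allows_public_access(policy_json):
    """Hypothetical helper: flag any Allow statement with a wildcard principal."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    return any(
        s.get("Effect") == "Allow" and _is_wildcard_principal(s.get("Principal"))
        for s in statements
    )

# Illustrative policies only; "example-bucket" and the account ID are placeholders.
PUBLIC_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
})

PRIVATE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
})

print(allows_public_access(PUBLIC_POLICY))
print(allows_public_access(PRIVATE_POLICY))
```

The point stands either way: the check is small, but someone still has to run it.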

Cloud is not an insurance against incompetence.

And it opens you up to potential exposure due to mistakes at the cloud provider.

About two years ago we got an email from AWS associated with a PHD (Personal Health Dashboard) notice. It “apologized” for an issue whereby the EC2 Security Groups in a single AZ were in place but not operative. All traffic was permitted for several hours, irrespective of the SG config.

We deploy and align host-based firewalls alongside whatever the cloud provider gives us, for exactly this reason.
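As a rough idea of what "align" can mean in practice, here is a sketch in Python that compares two rule sets and reports drift. Everything in it is an assumption for illustration: the `Rule` shape, the `drift` helper and the sample rules are hypothetical, and how you export rules from the cloud provider or from the host firewall is left out entirely. It only does exact matching and does not reason about overlapping CIDR ranges:

```python
from typing import NamedTuple

class Rule(NamedTuple):
    """Hypothetical normalized inbound allow rule."""
    protocol: str  # "tcp" or "udp"
    port: int
    source: str    # CIDR block, compared as a plain string

def drift(cloud_rules, host_rules):
    """Return rules present on one side but missing on the other."""
    cloud, host = set(cloud_rules), set(host_rules)
    return {
        "only_in_cloud": cloud - host,
        "only_on_host": host - cloud,
    }

# Sample data only; the addresses are documentation/private ranges.
security_group = [
    Rule("tcp", 443, "0.0.0.0/0"),
    Rule("tcp", 22, "203.0.113.0/24"),
]
host_firewall = [
    Rule("tcp", 443, "0.0.0.0/0"),
    Rule("tcp", 5432, "10.0.0.0/8"),
]

result = drift(security_group, host_firewall)
print(result["only_in_cloud"])
print(result["only_on_host"])
```

If both sets stay empty, a failure of either layer still leaves the other one enforcing the same policy.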

Somewhere along the line “the cloud” seems to have gotten a reputation for some level of infallibility of which I’m not convinced.

See also the recent problem where Entra logs weren’t captured for some tenants, and are just gone.

I didn't say there were no leaks or that there is no incompetence. I wrote about the number of corners that are no longer available to be cut. Corner cutting isn't exclusive to data leaks. It impacts everything, mostly the people actually working on the stuff.

Taking away responsibility from the people or departments that clearly can't handle it, that is what this means.

It does not mean that the responsibility that remains suddenly no longer ends up with incompetent actors. It just means it is now smaller, and smaller to a degree where it is very much worth it in most cases.

And just like I wrote earlier, there are cases where that works the other way around as well, and that just reinforces my point.

> The entire underlying layer of possible misconfigurations is absent in the cloud.

This is true.

Let's not forget there is a whole new, quite different, layer of potential (and easy) misconfigurations that exist only in the cloud, so it balances out.

When you can accidentally expose services with a single mouse click where it used to take someone with access to the server room going in and grabbing a cable and wiring it wrong, this category of problem is a lot more common now.

There is a middle era between a cable in a datacenter and a misclick in a cloud. Currently, on-prem is still 1 misclick away from accidental exposure (unless it's been untouched for 20+ years).

Be it with a legacy DMZ setup, or something a bit more segmented with an ADC/proxy policy that is slightly too wide: you can make those exact same mistakes with a stack of Palo Alto/Cisco/F5/IIS.

Unless you're running an entire OpenStack setup with SDN layers and policies (hint: most on-prem setups don't), there is a crapton of re-use when it comes to systems. A classic webserver that used to be just for public stuff will just as easily have some private applications added 'temporarily' (read: forever), along with a crappy WAF/proxy rule that is supposed to deny public access but gets bypassed with a simple URLEncode.
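To make the URL-encoding bypass concrete, here is a sketch in Python. The `/internal` prefix and both function names are made up for illustration, and it assumes the backend behind the proxy decodes percent-encoding before routing, which is what makes the bypass work in the first place. The naive check stands in for the kind of raw string match a hastily written deny rule does. The second check decodes and normalizes before matching, which is a conservative choice and may over-block on some stacks:

```python
import posixpath
from urllib.parse import unquote

BLOCKED_PREFIX = "/internal"  # hypothetical private application path

def naive_is_blocked(raw_path):
    """Matches the raw request path, the way a sloppy deny rule would."""
    return raw_path.startswith(BLOCKED_PREFIX)

def normalized_is_blocked(raw_path):
    """Decode percent-encoding and collapse '..' segments before matching."""
    decoded = raw_path
    # Bounded loop: decode until stable to catch double encoding (%2569 -> %69 -> i).
    for _ in range(5):
        once_more = unquote(decoded)
        if once_more == decoded:
            break
        decoded = once_more
    return posixpath.normpath(decoded).startswith(BLOCKED_PREFIX)

for path in ("/internal/report", "/%69nternal/report", "/public/../internal/report"):
    print(path, naive_is_blocked(path), normalized_is_blocked(path))
```

The encoded path slips past the naive rule while the backend still serves the private app, and that is exactly the 'temporary' setup that never gets revisited.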

Doing the lower layers requires knowledge and dedication, of which the first is getting harder to find (not easier) and the second is getting squeezed out of most processes since it isn't something that gets quantified as value.

So no, it doesn't balance out, and no, the cloud doesn't do a magical new layer of things that on-prem couldn't do, even if on-prem usually fails to deliver on an abstraction layer (while the cloud does have it). A cloud does make it much more visible, cost-wise and impact-wise, because you can't hide in a cloud. What goes into a cloud API also comes out of the cloud API. There is no network scanning and hoping you find all hosts and appliances; everything that exists can be queried, and also gets billed with plenty of detail. On-prem has none of that, and the last 30 years of inventory/asset management attempts have proven that it's still something most on-prem setups don't do at all, or do a really crappy job at.

That's really what some/most companies want, a platform that can run cheap, fast and easy VMs, like on-prem, but without the hassle of having to deal with the hardware and physical network part, like in the cloud. Sadly that's not the choice being offered.

I don't know, I've seen the shittiest stuff built on-prem and in cloud, and I've seen completely amazing on-prem infrastructure and cloud stuff that could not possibly be built outside AWS.