by DCKing 2164 days ago
Self hosting is only best if you want maximum control. But most people can't handle that imagined level of control in reality, at least not on the same level as a major cloud platform. AWS (or GCP or Azure) infrastructure is overwhelmingly more likely to be able to do security better from a maintenance, intrusion handling or physical security perspective than any single organization can do, especially organizations smaller than AWS. The tradeoff is that despite how much AWS can claim to not touch your data, and all the feature, contract and compliance documentation they have to show for it, you can never be sure they're not touching it (deliberately or by accident).

Self hosting is not about confidentiality. For nearly all categories of "confidential data", I would much much rather have it in a major cloud platform than running in some closet somewhere or in some random colocation center somewhere, all other circumstances being equal.

Self hosting is about how much you want to be in control, regardless of your capabilities to actually be in control.

3 comments

> Self hosting is only best if you want maximum control.

Not necessarily.

Beyond a certain scale you can go build your own datacenter (or, at a smaller scale, rent a whole rack cabinet in a datacenter) and start exploiting economies of scale.

A lot of people don't realize that nowadays you can pack tens of cores and literally terabytes of RAM in a 2U server.

That's what I don't quite understand about the current state of cloud computing. We're seeing huge advances in hardware/network technologies this decade but there's an ever increasing push to centralize hosting with cloud providers. Will this ever swing the other way?

Look at what's driving the shift: data centers are a major capital investment up-front plus a significant amount of staffing to operate and secure them. If you have enough proven need to justify that, you can easily beat a cloud provider, especially if you can simplify the problem in some ways that a generic service cannot.

For most organizations, however, it's hard to justify investing millions of dollars up-front in the hope that at some point you'll be saving enough to make that pay off. If that's not your core business it's often easier and safer to outsource it so, for example, you don't end up with a data center full of 50% utilized hardware which you bought to have capacity for growth which wasn't quite what you expected — or a big crunch when you have more demand than capacity and now need to double that investment to handle [currently] 10% of your usage.

> For most organizations, however, it's hard to justify investing millions of dollars up-front in the hope that at some point you'll be saving enough to make that pay off.

Well, if you have your bills and an estimate of how much building and operating a datacenter would cost, the calculation could be very easy to do.
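A rough sketch of that calculation, assuming you only compare a one-off build cost plus a flat monthly operating cost against a flat monthly cloud bill. The function name `breakeven_month` and every dollar figure below are made up for illustration; a real estimate would also need growth, hardware refresh, staffing changes and so on.

```python
def breakeven_month(upfront_cost, monthly_selfhost_cost, monthly_cloud_bill,
                    horizon_months=120):
    """Return the first month in which cumulative self-hosting cost is no
    higher than cumulative cloud cost, or None if that never happens
    within the horizon."""
    if monthly_selfhost_cost >= monthly_cloud_bill:
        # Self-hosting never catches up if it costs more to run each month.
        return None
    for month in range(1, horizon_months + 1):
        selfhost_total = upfront_cost + monthly_selfhost_cost * month
        cloud_total = monthly_cloud_bill * month
        if selfhost_total <= cloud_total:
            return month
    return None


# Hypothetical numbers: 600k up-front, 20k/month to run, 45k/month cloud bill.
print(breakeven_month(600_000, 20_000, 45_000))  # 24
```

With these invented numbers the investment pays for itself after two years; with a smaller cloud bill or a higher build cost it may never do so within the horizon, which is the risk the parent comment describes.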

Btw, one should not so easily dismiss the work of datacenter companies. They often have very high security standards and practices.

And this means that you don't necessarily have to build a datacenter from the ground up. You can start saving by just renting one or two rack cabinets and start putting your own hardware in there.

Oh, I’m not being dismissive of their work - it’s just multiple lines of skilled work which you have to complete. The building, hardware, and software management all require 24x7 operations and security, work with vendors and capacity planning, etc. and overseeing all of that work.

At some level of usage those costs are lower than the savings, but that line has been going up for years, especially for anyone who needs PCI, HIPAA, FedRAMP, etc., where there’s a ready package available covering a lot of it.

Yeah, especially if your company has more than one location, for redundancy.

Dell PowerEdge R6525 - 1U server

- CPU: 128 cores, 256 threads (2 sockets)

- RAM: Up to 2TB RDIMM or 4TB LRDIMM (16 channels)

- Avg. power at 100% load: 750W

Standard rack size is 45U:

- CPU: 5760 cores, 11520 threads

- RAM: 180 TB

- Power: 33.75kW

You might need to sacrifice 1U or 2U for switches.
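The rack totals above are just the per-server figures multiplied by the number of rack units. A small sketch, using the per-server numbers quoted in this comment; the function name `rack_totals` and the `switch_units` parameter are illustrative, and real racks also need space and power for PDUs, cabling and cooling headroom.

```python
# Per-server figures as quoted above for the 1U server.
CORES_PER_SERVER = 128
THREADS_PER_SERVER = 256
MAX_RAM_TB_PER_SERVER = 4   # LRDIMM maximum
WATTS_PER_SERVER = 750      # average at 100% load


def rack_totals(rack_units=45, switch_units=0):
    """Aggregate capacity of a rack filled with 1U servers,
    minus any units given up for switches."""
    servers = rack_units - switch_units
    return {
        "servers": servers,
        "cores": servers * CORES_PER_SERVER,
        "threads": servers * THREADS_PER_SERVER,
        "ram_tb": servers * MAX_RAM_TB_PER_SERVER,
        "power_kw": servers * WATTS_PER_SERVER / 1000,
    }


print(rack_totals())                # full 45U of servers
print(rack_totals(switch_units=2))  # 2U given up for switches
```

Giving up 2U for switches leaves 43 servers, so 5504 cores, 172 TB of RAM and 32.25 kW.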

The current generation is so "cloud-happy" that they don't appreciate the cost of being cloud-based... (10x? 100x? more?)

Large companies have been announcing HUGE savings, and small companies would be able to save a LOT too... such a pity. All the cloud abstraction creates lazy teams IMO, and lazy companies... (again IMO; I know this won't be a popular view, because this audience is exactly the cloud-happy audience, but if you manage some self-criticism, self-hosting / colo etc. is probably a better fit for 99% of cases)

It seems you're talking more about running costs, and not about any of the security aspects I was talking about?

I'll happily accept that you can pay less money for the same amount of power, but security isn't free. You don't only outsource a considerable amount of performance and reliability engineering to $MAJOR_CLOUD_PROVIDER, but also a lot of security engineering. Doing a lot less of that is cheaper for sure, but is that worth the cost? I'd argue that for most (not all: most), it isn't.

At ever growing scales the equation will eventually tip in your favor, but you have to either be working at a very substantial scale for that, or you simply must not care about an important portion of the tasks that the major cloud provider picks up for you. That is fine by the way, but you have to be sure that that is actually a conscious decision and you're not simply forgetting to actually do that work or doing it poorly.

I hope I don't sound combative, but "most people can't handle that imagined level of control in reality" doesn't seem very fact based. All companies used to self host, because there was no cloud. Many organisations still self host, like shared hosting providers, governmental institutions, banks. I have two counter arguments against "big cloud can do better":

1) There is often an assumption that the people behind cloud services are smarter than anyone else and don't make mistakes. In reality, they are still human. Big names attract some bright people, but not everyone is a genius with good teamwork skills.

2) Cloud companies have a much harder problem to solve. They have two hostile fronts: the outside world and their clients. They need to protect themselves from malicious clients and keep clients separated. They offer generic services for everyone, so there is functionality that goes unused in your use case. In a self-hosted setup you can disable or uninstall things (across the infrastructure as a whole, not just inside your virtual machine), filter aggressively at the network perimeter, and so on.

Self-hosting doesn't only mean "running in some closet somewhere or in some random colocation center". I can't say how things are done in the US, but in my country the government has several DCs/server rooms for governmental agencies. There is on-premises hosting too, sometimes with very good physical security.

Bear in mind that while cloud providers can be more technically sophisticated with their security, they are inherently less secure from the get-go than a box on premises, because by default the cloud provider must be configurable across the Internet from anywhere in the world, and my self-hosted box, by default, can only be configured by a mouse, keyboard, and monitor physically plugged into it.

From that point, yes, you open up your self-hosting to the world in a (hopefully) limited fashion and restrict access to your cloud management (hopefully) to a much narrower scope. But by default, a box in your building starts completely secure, and your AWS box starts accessible to anyone on the planet with your AWS password.

While what you say is true, the underlying assumption is the general approach of "my network is secure and keeping track of what goes in and out is something I control". I won't comment on how it applies to your situation, but I do think this assumption is outdated castle-and-moat security thinking. This assumption also falls apart on closer inspection in 99% of the situations, and in my experience especially so in cases where people bring it up.

Large companies with very decentralized infrastructure (who also profit off selling clouds in many cases) promote zero-trust infrastructure models. This is predominantly based on what works for them (having hundreds of offices or large amounts of remote staff), and it is, of course, something you must sell people on if you want to sell cloud services.

Zero-trust is not without merit, by any means. It is good to not assume there are no cracks in your walls, and you should indeed use as much internal security as possible wherever you can.

But you know what's really quite silly? Deciding to fill your moat in with dirt and knock over your castle wall because you think it's possible for someone to get in anyways.

You had better believe I'm going to use the latest authentication and encryption tools between machines that I can to ensure nobody can listen in from a stray network connection... and that I'm also going to put all of it behind a firewall.

Yeah, lock your doors inside your castle, but for heaven's sake, the moat and the castle walls still help. Defense-in-depth is a concept I swear everyone forgot when clouds became a thing.

Ah yes, but now you're describing requiring you to set up engineering regiments within your castle, some portion of which you could outsource to your cloud provider in the alternative situation. My point is that this security boundary that you posit as an advantage in your original post is simply not very interesting (and often harmful) on its own.

Cloud providers aren't very intelligent security layers.

My favorite example is my Google Voice account. It has a different area code (out of state) than my real phone number. I get a lot of spam calls, almost all through Google Voice, and I know not to answer them, because nobody legitimate calls me from the area code my Voice number is from.

Google has state-of-the-art artificial intelligence and spam filtering capabilities. Those are arguably the two most sophisticated advantages Google has, and they are completely ineffective at blocking spam calls. If Google Voice gave me the ability to create my own filter rules, I could write a one-line rule that would drop any call from that one area code, and I would have perfect spam filtering for my account.
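To make the "one-line rule" concrete, here is a minimal sketch of what such a filter could look like. This is not a Google Voice API (the comment's point is that no such hook is offered); `should_drop`, `area_code` and the blocked code `555` are hypothetical, and the parsing assumes North American style 10-digit numbers with an optional leading 1.

```python
import re

# Hypothetical: the one area code nobody legitimate calls from.
BLOCKED_AREA_CODES = {"555"}


def area_code(number):
    """Extract the 3-digit area code from a 10-digit number
    (optionally prefixed with country code 1), or None if it doesn't fit."""
    digits = re.sub(r"\D", "", number)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits[:3] if len(digits) == 10 else None


def should_drop(caller_number):
    """The rule itself: drop any call whose area code is blocked."""
    return area_code(caller_number) in BLOCKED_AREA_CODES


print(should_drop("+1 (555) 010-1234"))  # True
print(should_drop("212-555-0100"))       # False
```

The rule is trivial precisely because it encodes knowledge only the account owner has; a provider filtering for everyone cannot apply it globally.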

This isn't an example about Google Voice, but about the difference between generalized technologies that cloud providers use versus configurations you can apply yourself that are custom tailored. Obviously, Google can't block everyone in that area code as a spam filtering method... many people legitimately have that area code. But for my phone, it would be a good rule and would be nearly 100% effective.

Which is to say, my engineering regiments will always be more capable than my cloud provider's engineering regiments, because mine know my system and my customers and my use cases. I'm paying engineering regiments either way, so I might as well pay my own.

> Cloud providers aren't very intelligent security layers.

I think I lost track of what you're trying to discuss now. I'm not arguing cloud providers are a security "layer" in any sense, just that they take responsibility for some things you otherwise need to do yourself. If you got that from my post I apologize. Even if I said something like this, I don't know how your Google Voice example (which is an application/service) applies to cloud infrastructure.

> Which is to say, my engineering regiments will always be more capable than my cloud provider's engineering regiments, because mine know my system and my customers and my use cases.

Good for you if true, but I've personally never seen an environment where such confidence on the part of infrastructure engineers has held up. At least not from a security perspective.

> I'm paying engineering regiments either way, so I might as well pay my own.

If it turns out the equation favors you, then great, those companies exist. But I don't think the equation favors many, at least not when including all the items you need to have for self hosting.