by moltar 1643 days ago
I think the mistake is often made by comparing primitives. E.g. running my own RAID vs S3. Colo traffic vs AWS traffic.

But what about comparing the whole ecosystem?

Can you provide a self hosted granular access permission to your RAID? How hard is it to configure and maintain?

Will your colo deflect a DDOS attack?

When you run your own services, you have to reinvent so much it doesn’t seem to be worth it.

3 comments

> Can you provide a self hosted granular access permission to your RAID?

This is the second-level mistake engineers commonly make:

The right question isn't "Can you do X". Give engineers enough time and resources and they can usually come up with a solution to do X.

The real question is "How much time and resources need to be invested to accomplish X at a satisfactory level?"

And the third-level mistake is to assume that getting something to work once is the finish line. In practice, getting something to work once is just the beginning. Getting it to a maintainable, well-documented, repeatable state is a lot more work.

Cloud services make all of this effort disappear. Type a few commands and it's good to go. Now you can take all of the engineering hours that would have gone into the DIY version and allocate them to working on the company's product instead of reinventing architecture that you could have simply paid for.

Good engineers are scarce and expensive. Using them to reinvent infrastructure that can be trivially purchased for a nominal amount is a terrible move most of the time. Even when it does make sense, the right move is to build the prototypes on AWS and then consider transitioning to self-hosted later if the numbers work out.

Eh, I would argue that any advantage cloud has in ease of configuration exists because of the brain drain in good server software caused by the cloud. Spend some time in Microsoft Azure and it becomes instantly sad that all this manpower was pulled off of Windows Server (which has stagnated as a product) and invested in a proprietary service product that runs on top of Windows Server. And the former will more than likely outlive the latter.

You make a lot of claims without providing evidence.

> Can you provide a self hosted granular access permission to your RAID?

Yes.

> How hard is it to configure and maintain?

Very few things are harder to configure or maintain than they are on a cloud service, because if they were, someone (e.g. you) would get frustrated and make them easier, and then they wouldn't be for anyone else.

> Will your colo deflect a DDOS attack?

Ah yes, S3 can handle serving that many requests and keep everything online. But then don't you get a bill for $72 billion?

I believe both Azure and AWS (probably GCP too) have built-in DDOS mitigations for free.

https://docs.aws.amazon.com/waf/latest/developerguide/ddos-s...

You might be on the hook for bandwidth costs from a more sophisticated attack though.

That isn't very specific about how it works ("defends against the most common, frequently occurring network and transport layer DDoS attacks" whatever that means), but it sounds like they're going to drop weird looking packets.

The problem is, one of the more common types of DDoS is that the attacker has a botnet with a million machines in it and has them all make legitimate requests to your service all day, thereby overloading it. This looks just like a large volume of legitimate requests, because it is. S3 or similar isn't going to get overloaded, but then what stops you from getting a bill the size of the moon?

To do otherwise they'd either have to be able to distinguish these from legitimate requests (how?) or give you free traffic when you claim you were under a DDoS that they can't distinguish from a large volume of legitimate traffic (unlikely).

Why is this unlikely? If you do it several times then they will start to get annoyed and say no, but a service like AWS is all about long-term customer relations. I've had bills of ~$1k refunded even though I'm a ~$3/month user.

Waiving a bill for a thousand dollars isn't really costing them a thousand dollars because their underlying cost is much lower than that.

Do the math on how much the S3 bill would be if a million bots, each with a 100 Mbps cable connection, DDoSed you for a month. A thousand dollars is too low by how many orders of magnitude?
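
For anyone who wants to actually run that math, here is a rough sketch. The bot count and per-bot bandwidth come from the comment; the 30-day month, the assumption that every bot saturates its link the whole time, and especially the per-GB egress rate are placeholders rather than quoted prices, so plug in your own numbers.

```python
import math

# Back-of-the-envelope estimate. Every constant is an assumption to adjust.
BOTS = 1_000_000                     # from the comment: a million bots
MBPS_PER_BOT = 100                   # from the comment: 100 Mbps each
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumed 30-day month
PRICE_PER_GB_USD = 0.05              # hypothetical egress rate, NOT a quoted price
REFUNDED_BILL_USD = 1_000            # the ~$1k refund mentioned upthread


def monthly_egress_gb(bots, mbps_per_bot, seconds):
    """Total data served in decimal GB if every bot saturates its link."""
    total_bits = bots * mbps_per_bot * 1_000_000 * seconds
    return total_bits / 8 / 1e9


def monthly_bill_usd(egress_gb, price_per_gb):
    """Flat-rate bill; real pricing is tiered and also charges per request."""
    return egress_gb * price_per_gb


def orders_of_magnitude(bill, reference):
    """How many powers of ten separate the bill from the reference amount."""
    return math.log10(bill / reference)


if __name__ == "__main__":
    gb = monthly_egress_gb(BOTS, MBPS_PER_BOT, SECONDS_PER_MONTH)
    bill = monthly_bill_usd(gb, PRICE_PER_GB_USD)
    print(f"egress: {gb:.3g} GB")
    print(f"bill:   ${bill:,.0f}")
    print(f"orders of magnitude above $1k: {orders_of_magnitude(bill, REFUNDED_BILL_USD):.1f}")
```

Under those placeholder assumptions the egress comes to about 3.24e10 GB and the bill to roughly $1.6 billion, about six orders of magnitude above a $1k refund. A different per-GB rate scales the result linearly.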

You might get them to waive that, maybe, or maybe not. Even at their cost they'd never make it back from you. Do you have any guarantee that they will? What happens if they don't? What happens if they do it once, but the attack hasn't ended?

Umm, why wouldn't you run Ceph yourself? It speaks S3. (It has a component called RGW, the RADOS Gateway, which is completely stateless and scalable, and implements the S3 ACL policy XML.)

And yes, running it has a cost.

But it also has the advantage that devs can run it locally in Docker easily. CI can spin up endless test clusters.

And so on.

Obviously you are right that the right way to compare cloud vs non-cloud is to look at the full picture. And that also means we need the context.

Small/hobby project? Doesn't matter. You can run on your own toaster or on Oracle cloud or on AWS/GCP/Azure. Just do what you want, the costs are negligible.

Operating business with a stable, well-predicted size? Again, do whatever you want. If IT is a big part and costs matter, optimize for cost and run it on a few dedicated boxes. If you are not cost sensitive and you want to be one of the cool kids, run it in AWS or whatever. (We have a client that has existed for 25+ years and reached its optimal size. It does some innovation from time to time, but that is basically a new website or app. The underlying backend stays the same; maybe they'll replace it eventually, probably with a complete SaaS, and then they'll only need to host a landing page.)

Large multinational company with more departments than sanity? Again do whatever you want, likely you have bigger problems than the cloud bill or the inability to run one more app in your DC.

"Unicorn" startup? Crunch the numbers, do what makes sense. Everyone knows that "Netflix went full AWS", but maybe not everyone knows that they also went full on-prem for their CDN, with hundreds or thousands of local caches at ISPs / IXPs.

And so on.