Hacker News
by nothrabannosir 2872 days ago
That calculation doesn’t include the salary of the person managing those EC2 instances. Lambda is easier to manage and takes less time than an EC2 box. SSH keys, backups, AMI base images, system updates: it all adds up. Ansible and Terraform are so far removed from the core competency of any company that you really need to ask yourself, for any second you spend working on them: is this worth it? At almost no company’s scale* is it so.

Salary is a hidden cost people often forget. So is the time your programmers spend debugging infra: time not spent creating value. So is the frustration of having to deal with TF or Ansible in the first place.

These are all benefits of a serverless solution which cost-aware criticisms should at least try to quantify and take into account.

Signed, not-a-shill.

* I’ll rephrase: a significant number of companies don’t have the luxury of being so swamped by requests that they can saturate EC2 boxes and make the devops overhead (which, I’m arguing, is big but relatively constant/“sub linear”) worth the price difference (which scales linearly). Significant enough for these articles to have a raison d’être.
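To make the footnote concrete, here is a rough back-of-the-envelope sketch of the break-even point between a roughly constant devops overhead and a per-request price that scales linearly. Every name and number in it (function names, the overhead, instance price, per-instance capacity, per-million price) is a made-up assumption for illustration, not real AWS pricing.

```python
def monthly_cost_ec2(requests, ops_overhead, instance_cost, requests_per_instance):
    """Fixed ops overhead plus enough instances to serve the load (all inputs hypothetical)."""
    instances = max(1, -(-requests // requests_per_instance))  # ceiling division
    return ops_overhead + instances * instance_cost


def monthly_cost_lambda(requests, cost_per_million):
    """Pay-per-request cost that grows linearly with traffic (hypothetical rate)."""
    return requests / 1_000_000 * cost_per_million


def break_even_requests(ops_overhead, instance_cost, requests_per_instance,
                        cost_per_million, step=1_000_000, limit=10_000_000_000):
    """Smallest monthly request count (in multiples of `step`) at which the
    EC2 setup is strictly cheaper, or None if that never happens below `limit`."""
    for requests in range(step, limit + 1, step):
        ec2 = monthly_cost_ec2(requests, ops_overhead, instance_cost, requests_per_instance)
        serverless = monthly_cost_lambda(requests, cost_per_million)
        if ec2 < serverless:
            return requests
    return None


if __name__ == "__main__":
    # Illustrative inputs only: 2000/month of ops time, 70/month per instance,
    # 50M requests per instance per month, 4.00 per million serverless requests.
    print(break_even_requests(2000, 70, 50_000_000, 4.0))
```

Below the returned volume, the fixed overhead dominates and the per-request option comes out cheaper under these assumptions; above it, the linear term wins. The interesting question is where your own numbers put that point.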

5 comments

In my experience, the developer time it takes to get a serverless application running (on AWS anyway) far exceeds the benefits that serverless brings. We spent over six months on a project written in serverless that we could never really get working properly. We dumped all that code and rewrote it on a standard LAMP stack in less than half that time. Factoring in that the problem was better understood the second time around, it would still have been cheaper to build it the way we are from the beginning.
Was that due more to lack of training? I am first and foremost a developer, but I’ve spent weeks understanding AWS from a netops, devops and development standpoint.

I’m also working on certifications, mostly as a method to force me to learn in a structured manner; my company will pay for them, and they still have some market value.

Training also takes time.
Are you proposing that developers shouldn’t study and learn about whatever infrastructure they are using?
I'm not proposing anything. I'm saying that training should be factored in when considering development costs. The notion that training is always a one-time investment that can be pretty much ignored if you plan to stick with some tech long-term is a fallacy.
Things get added to AWS all of the time, but if you are using the same set of services, training is a one-time thing. Amazon doesn’t just pull the rug out from under you.
Isn't that the whole point of the cloud?

I mean, if you care about the infrastructure there are few reasons left not to self-host.

The point of the cloud is not to be ignorant about the infrastructure. It’s to not have to babysit hardware and focus on your core competency - let someone else do the “undifferentiated heavy lifting”. I’ve seen cases where “AWS Architects” spun up a bunch of EC2 instances, ran their own services that had AWS managed equivalents, and then wondered why everything cost more.

Could it possibly be things like they have 3 EC2 instances to run a cluster for Consul instead of using AWS services?

It’s also about elasticity. It’s much easier and more cost effective to spin up 20 VMs (whether they are EC2 instances or Lambdas, which are basically VMs) to run a test and see how many you actually need.

Other times it makes sense to use a bunch of spot instances to save money and choose a cost optimization based on throughput vs. cost for backend processing.

What about the salary of the developers and SREs who need to manage this either way?

Lambda and API Gateway are nothing but expensive vendor lock-in.

Excellent point. That's why we don't bother with server virtualization, Docker, or Ansible.
Docker, Ansible, and the Red Hat stuff are all open source that you can run on your own. There is no vendor lock-in.
My last job was running a lot of open source stuff and they spent maybe 20 hours per week troubleshooting things, multiple people together, while the rest of the team was waiting.

It was a complex environment with high availability though, with maybe 15 different types of software trying to work together...

After using AWS for a while, I have to say that things usually work and don't suddenly start to malfunction, even when not doing serverless.

The downside of serverless is definitely vendor lock-in, but I think having the complexity of open source software in house, and being able to try new software, is a double-edged sword that can cost huge money as well.

It's not a simple choice.

It's not just lock-in. It's troubleshooting Lambda. Here are a couple of questions and answers we've recently delved into:

- How do I avoid a cold start? (Fake calls into the system to the point where AWS has to start more instances of your application)

- Why is my cold start time different between development and production? (Production is in a VPC, which has greater instance spin-up times)

- Why has Lambda stopped executing on my Kinesis stream? (Oops, AWS' bad)

- How can we get more parallel Lambda processing on a Kinesis stream? (Clone the stream and attach to the cloned stream)

- What can I do to reduce the cold start costs for Java? (Specify your objects as static, since those are evaluated at JVM startup, which AWS doesn't charge you for)

- How do I do Infrastructure as code? (CloudFormation; expect an average of 4 CF objects per function. SAM makes the CF more user friendly, but it's harder to debug and encourages the creation of extra infrastructure)

- How do I increase the bandwidth allocated to my Lambda? (Increase the memory allocation)

> What can I do to reduce the cold start costs for Java? (Specify your objects as static, since those are evaluated at JVM startup, which AWS doesn't charge you for)

Well Java was the first mistake if you want a reasonable startup....

But they significantly reduce the workload on the people that would otherwise have to manage those things, just like a serverless architecture does.

If you want to argue about vendor lock-in, sure. But removing the necessity of managing scaling, individual instances, and OS / software patching is significant.

> But removing the necessity of managing scaling, individual instances, and OS / software patching is significant.

With AWS (and other cloud provider) APIs and tools like Packer, these difficulties are vastly overstated.

Building VM images is just another step in the CI/CD pipeline, and patching and deploying a zero day fix becomes "kick off the CI/CD pipeline". You can even do automated unit testing of an image with tools like Inspec.

Scaling is an API call to change a "desired instance count" value (if it's not already automated), and complex problems with any individual instance can be resolved with STONITH (terminating the instance).
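As a rough illustration of "scaling is an API call", here is a sketch using the boto3 Auto Scaling client methods `set_desired_capacity` and `terminate_instance_in_auto_scaling_group`. The client is passed in so the functions can be exercised without AWS credentials; the helper names, group name, and instance ID are hypothetical.

```python
def scale_to(autoscaling_client, group_name, desired_count):
    """Ask the Auto Scaling group to converge on `desired_count` instances."""
    autoscaling_client.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=desired_count,
        HonorCooldown=False,  # apply immediately, ignoring the group's cooldown
    )


def replace_instance(autoscaling_client, instance_id):
    """Terminate a misbehaving instance and let the group launch a replacement."""
    autoscaling_client.terminate_instance_in_auto_scaling_group(
        InstanceId=instance_id,
        ShouldDecrementDesiredCapacity=False,  # keep capacity, so a new one starts
    )


# Usage against a real account (assumes boto3 is installed and configured):
#   import boto3
#   client = boto3.client("autoscaling")
#   scale_to(client, "my-web-asg", 6)                 # hypothetical group name
#   replace_instance(client, "i-0123456789abcdef0")   # hypothetical instance ID
```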

> Building VM images is just another step in the CI/CD pipeline, and patching and deploying a zero day fix becomes "kick off the CI/CD pipeline". You can even do automated unit testing of an image with tools like Inspec.

There is a non-zero cost in maintaining that process, including paying people to know and understand things like Inspec, Packer, and Linux troubleshooting. There are also full OS upgrades that can invalidate the assumptions you made, which means revising your process accordingly.

> Scaling is an API call to change a "desired instance count" value (if it's not already automated)

That automation is significant complexity. You'll be maintaining whatever health / resource checks are necessary to determine when scaling up is necessary, when scaling down should be done, what initialization / teardown tasks need to be done, etc. You'll also need some kind of health checks / monitoring to ensure this process is operating as it should so that you can detect if there's a problem with it. All of that needs to be known / understood / documented / maintained by someone.
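To give a sense of what even the simplest version of that automation has to decide, here is a toy sketch of a scale-up / scale-down rule with a cooldown. The thresholds, field names, and one-step policy are arbitrary assumptions for illustration; a real setup would also need the health checks, init / teardown hooks, and monitoring described above.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical knobs someone has to choose, document, and maintain."""
    min_instances: int = 2
    max_instances: int = 20
    scale_up_cpu: float = 70.0     # average CPU % above which we add an instance
    scale_down_cpu: float = 30.0   # average CPU % below which we remove one
    cooldown_seconds: int = 300    # minimum time between scaling actions


def decide_desired_count(policy, current, avg_cpu, seconds_since_last_change):
    """Return the instance count the group should converge on."""
    # Avoid flapping: do nothing while the last change is still settling.
    if seconds_since_last_change < policy.cooldown_seconds:
        return current

    if avg_cpu > policy.scale_up_cpu:
        desired = current + 1
    elif avg_cpu < policy.scale_down_cpu:
        desired = current - 1
    else:
        desired = current

    # Never leave the configured bounds.
    return max(policy.min_instances, min(policy.max_instances, desired))
```

Each constant here is a decision that can be wrong for your workload, which is the maintenance burden being argued about.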

And that's only for the stateless part. If you're trying to do the same with a relational database, it only gets tougher.

> and complex problems with any individual instance can be resolved with STONITH (terminating the instance).

Only if the problem is truly non-recurring and only in a single instance. Otherwise, it will be Linux troubleshooting to find out if it's your software, an OS patch, a third party software patch, or some other issue.

Nothing is reduced, just shifted away from your perspective. You pay for the server running your serverless code. You just share that cost.

Sharing the cost brings benefits but it also brings pain. The first time you hit an AWS hard limit you will realize how much pain.

> Nothing is reduced just shifted away from your perspective. You pay for the server running your serverless code. You just share that cost.

This is no different from any other abstraction.

> Sharing the cost brings benefits but it also brings pain. First time you hit an AWS hard limit you will realize how much pain

Many AWS services have no architectural scaling limits. S3, Lambda, API Gateway, DynamoDB, SNS, and SQS are examples. And you can always take comfort in knowing that many people are actively using the service at far greater scale than you ever will.

That's not to say they won't ultimately limit (e.g.) how many S3 buckets you create, but being able to create unlimited S3 buckets is not necessary to use S3 as designed in a near-infinitely scalable way.

So where exactly are we going to run this “on our own” in a regionally or globally distributed manner? Then we have to pay more people for netops to do things that don’t give us any competitive advantage.
Lambda is often the way to go, but there are a couple tipping points where it becomes less appealing:

- When you start having to spend a lot of time working around its limitations. (As is the case with all managed services.)

- When the sum of its parts creates a distributed system whose complexity costs you more in productivity than it gives back.

Performance limits will get better with time. The complexity cost is more difficult to predict. Hopefully it will improve as best-practices emerge and tooling gets better.

Debugging, organizing, and testing lambdas vs. traditional stacks are the hidden costs on the developer side.
Testing a lambda is just as easy as testing a controller action in the typical MVC framework. You just pass a JSON payload to your lambda handler.
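A minimal sketch of what that looks like in Python: the handler is an ordinary function, so a test just builds an event dict and calls it. The handler below and its API Gateway-style `body` / `statusCode` shape are hypothetical examples, not code from any project in this thread.

```python
import json


def handler(event, context):
    """Hypothetical handler: greets the name given in the JSON request body."""
    body = json.loads(event.get("body") or "{}")
    name = body.get("name")
    if not name:
        return {"statusCode": 400, "body": json.dumps({"error": "name is required"})}
    return {"statusCode": 200, "body": json.dumps({"message": f"Hello, {name}"})}


def test_handler_greets_by_name():
    # The "JSON payload" is just a dict shaped like the event the handler expects.
    event = {"body": json.dumps({"name": "Ada"})}
    response = handler(event, None)  # context is unused here, so None is enough
    assert response["statusCode"] == 200
    assert json.loads(response["body"]) == {"message": "Hello, Ada"}


def test_handler_rejects_missing_name():
    response = handler({"body": "{}"}, None)
    assert response["statusCode"] == 400
```

This covers the handler logic only. It does not exercise IAM permissions, event source wiring, or timeouts, which is where the debugging cost mentioned in the parent comment tends to show up.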

Organizing is also simple - if you know CloudFormation. You should learn that anyway to provision resources in a repeatable fashion.

Very few companies are big enough that they can make the devops overhead worth the price difference, but most of those plan on eventually being so. So there's an argument to be made that the knowledge should be grown in house while you can still tolerate the small outages you get when you have stuff misconfigured or unbalanced.

Probably not a good argument though. If you're successful enough to be saturating servers you're probably successful enough to be hiring experienced competent help for the transition.