by moduspol 2875 days ago
Excellent point. That's why we don't bother with server virtualization, Docker, or Ansible.

Docker, Ansible, and the Red Hat stuff are all open source software that you can run on your own. There is no vendor lock-in.
At my last job we ran a lot of open source software, and the team spent maybe 20 hours per week troubleshooting it, with multiple people working together while the rest of the team waited.

It was a complex environment with high availability though, with maybe 15 different types of software trying to work together...

After using AWS for a while, I have to say that things usually work and don't suddenly start to malfunction, even when not doing serverless.

The downside of serverless is definitely vendor lock-in, but I think having the complexity of open source software in house, and being able to try new software, is a double-edged sword that can cost huge money as well.

It's not a simple choice.

It's not just lock-in. It's troubleshooting Lambda. Here are a few questions and answers we've recently delved into:

- How do I avoid a cold start? (Fake calls into the system to the point where AWS has to start more instances of your application)

- Why is my cold start time different between development and production? (Production is in a VPC, which has greater instance spin-up times)

- Why has Lambda stopped executing on my Kinesis stream? (Oops, AWS' bad)

- How can we get more parallel Lambda processing on a Kinesis stream? (Clone the stream and attach to the cloned stream)

- What can I do to reduce the cold start costs for Java? (Specify your objects as static, since those are evaluated at JVM startup, which AWS doesn't charge you for)

- How do I do Infrastructure as code? (CloudFormation; expect an average of 4 CF objects per function. SAM makes the CF more user friendly, but it's harder to debug and encourages the creation of extra infrastructure)

- How do I increase the bandwidth allocated to my Lambda? (Increase the memory allocation)
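The static-initialization advice for Java has a rough analogue in other runtimes: do the expensive setup once, outside the handler, so that only a cold start pays for it. A minimal Python sketch of that pattern follows. The `handler(event, context)` signature is the usual Python Lambda entry point; `load_expensive_config`, `INIT_CALLS`, and the table name are hypothetical stand-ins, and nothing here says anything about how the setup time is billed.

```python
import time

INIT_CALLS = 0  # counts how many times the expensive setup actually ran


def load_expensive_config():
    """Hypothetical stand-in for slow setup (SDK clients, pools, parsed config)."""
    global INIT_CALLS
    INIT_CALLS += 1
    time.sleep(0.05)  # simulate slow initialization
    return {"table": "example-table"}


# Module-level code runs once per execution environment (the cold start),
# not once per invocation.
CONFIG = load_expensive_config()


def handler(event, context):
    """Entry point: reuses CONFIG instead of rebuilding it on every call."""
    return {"table": CONFIG["table"], "request": event.get("id")}


if __name__ == "__main__":
    # Simulate three warm invocations in the same environment.
    for i in range(3):
        handler({"id": i}, None)
    print(INIT_CALLS)  # prints 1
```

The setup cost is paid once when the module loads; every later call in the same environment reuses `CONFIG`.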

> What can I do to reduce the cold start costs for Java? (Specify your objects as static, since those are evaluated at JVM startup, which AWS doesn't charge you for)

Well Java was the first mistake if you want a reasonable startup....

But they significantly reduce the workload on the people that would otherwise have to manage those things, just like a serverless architecture does.

If you want to argue about vendor lock-in, sure. But removing the necessity of managing scaling, individual instances, and OS / software patching is significant.

> But removing the necessity of managing scaling, individual instances, and OS / software patching is significant.

With AWS (and other cloud provider) APIs and tools like Packer, these difficulties are vastly overstated.

Building VM images is just another step in the CI/CD pipeline, and patching and deploying a zero day fix becomes "kick off the CI/CD pipeline". You can even do automated unit testing of an image with tools like Inspec.

Scaling is an API call to change a "desired instance count" value (if it's not already automated), and complex problems with any individual instance can be resolved with STONITH (terminating the instance).
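To make the "desired instance count" plus STONITH idea concrete, here is a toy reconciliation loop. It is a sketch under stated assumptions, not how any cloud provider implements it: `Fleet`, `reconcile`, and the instance IDs are all hypothetical, and a real setup would call the provider's API instead of mutating an in-memory dict.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Fleet:
    """Hypothetical in-memory stand-in for a group of instances."""
    desired: int
    instances: dict = field(default_factory=dict)  # instance id -> healthy?
    _ids: itertools.count = field(default_factory=lambda: itertools.count(1))

    def launch(self):
        instance_id = f"i-{next(self._ids):04d}"
        self.instances[instance_id] = True
        return instance_id


def reconcile(fleet):
    """Terminate unhealthy instances, then converge on the desired count."""
    # STONITH: do not debug a sick instance in place, just remove it.
    for instance_id, healthy in list(fleet.instances.items()):
        if not healthy:
            del fleet.instances[instance_id]
    # Launch replacements (or scale up) until the count matches.
    while len(fleet.instances) < fleet.desired:
        fleet.launch()
    # Scale down by removing the most recently launched surplus instances.
    while len(fleet.instances) > fleet.desired:
        fleet.instances.popitem()
    return sorted(fleet.instances)


if __name__ == "__main__":
    fleet = Fleet(desired=3)
    reconcile(fleet)                    # launches i-0001 .. i-0003
    fleet.instances["i-0002"] = False   # one instance goes bad
    print(reconcile(fleet))             # ['i-0001', 'i-0003', 'i-0004']
    fleet.desired = 2                   # "scaling is an API call"
    print(reconcile(fleet))             # ['i-0001', 'i-0003']
```

The point of the design is that nothing here inspects why an instance is unhealthy; it is replaced, and the count is the only state anyone edits.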

> Building VM images is just another step in the CI/CD pipeline, and patching and deploying a zero day fix becomes "kick off the CI/CD pipeline". You can even do automated unit testing of an image with tools like Inspec.

There is a non-zero cost in maintaining that process, including paying people to know and understand things like Inspec, Packer, and Linux troubleshooting. There are also full OS upgrades, which can invalidate assumptions you made and force you to revise your process accordingly.

> Scaling is an API call to change a "desired instance count" value (if it's not already automated)

That automation is significant complexity. You'll be maintaining whatever health / resource checks are necessary to determine when scaling up is necessary, when scaling down should be done, what initialization / teardown tasks need to be done, etc. You'll also need some kind of health checks / monitoring to ensure this process is operating as it should so that you can detect if there's a problem with it. All of that needs to be known / understood / documented / maintained by someone.
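As an illustration of the decisions listed above, here is a small sketch of the "when do I scale up or down" logic someone has to own. Everything in it is an assumption for the sake of the example: the `ScalingPolicy` name, the CPU thresholds, and the instance bounds are made up, and real policies typically also deal with cooldowns, instance warm-up, and missing metrics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy; every number here is an illustrative assumption."""
    scale_up_cpu: float = 70.0    # average CPU % above which we add an instance
    scale_down_cpu: float = 30.0  # average CPU % below which we remove one
    min_instances: int = 2
    max_instances: int = 10


def desired_count(current, cpu_samples, policy=ScalingPolicy()):
    """Return the new desired instance count for one evaluation period."""
    if not cpu_samples:
        # Missing metrics are a problem to alert on; hold steady rather than guess.
        return current
    average = sum(cpu_samples) / len(cpu_samples)
    if average > policy.scale_up_cpu:
        target = current + 1
    elif average < policy.scale_down_cpu:
        target = current - 1
    else:
        target = current  # the gap between thresholds avoids flapping
    # Never leave the configured bounds, whatever the metrics say.
    return max(policy.min_instances, min(policy.max_instances, target))


if __name__ == "__main__":
    print(desired_count(3, [80, 90, 85]))  # 4
    print(desired_count(3, [10, 20]))      # 2
    print(desired_count(2, [10]))          # 2 (already at the minimum)
```

Even this toy version has four numbers that someone must pick, document, and revisit, which is the maintenance burden being described.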

And that's only for the stateless part. If you're trying to do the same with a relational database, it only gets tougher.

> and complex problems with any individual instance can be resolved with STONITH (terminating the instance).

Only if the problem is truly non-recurring and only in a single instance. Otherwise, it will be Linux troubleshooting to find out if it's your software, an OS patch, a third party software patch, or some other issue.

TL;DR: Both paths require knowledge beyond how to write a web application; using Lambda doesn't absolve you of having to learn about or hire someone to manage your infrastructure.

> There is a non-zero cost in maintaining that process

Just as there is a non-zero cost associated with maintaining Lambda, API Gateway, and the associated CloudFormation scripts, and finding people who can (and are willing to) maintain them.

> That automation is significant complexity

99% of that complexity is already shouldered by AWS and their ilk. They implement log forwarding, metric dashboards, instance health checks, and simple (complete) examples of how to scale based on CPU and memory - the two metrics used for scaling in most cases.

As for OS upgrades, yes, those can require a bit more expertise. That said, those occur every two to four years, and for the past few OS upgrades I've had to handle, the pain was limited to converting sysvinit scripts to upstart scripts, and then to systemd unit files (none of which was strictly required, as an aside, since both upstart and systemd support sysvinit scripts natively).

> If you're trying to do the same with a relational database, it only gets tougher.

You mean RDS? Databases need to be maintained no matter how the application is run. For a quick personal anecdote, there's a world of hurt waiting unless someone is hired who knows how to manage and tune databases, no matter who runs the infrastructure.

> Otherwise, it will be Linux troubleshooting to find out if it's your software, an OS patch, a third party software patch, or some other issue.

How is this different? Linux troubleshooting skills won't help to identify if it's third party software or your software - and those pains don't go away magically with Lambda. In the exceptionally rare case that it is the OS, it will be fixable by kicking off your CI/CD pipeline.

A small tip: Like compilers, the problem isn't the OS. Even when you think it's the OS, it's not. It's your software. OOM killer taking out processes? Those processes are leaking memory. Running out of disk space? Clean up the logs. Cron is misbehaving? Fix the typo. It's also worth mentioning that all of those problems are at least temporarily resolved by STONITH; enough to give time to fix the application.

> Just as there is a non-zero cost associated with maintaining Lambda, API Gateway, and the associated CloudFormation scripts, and finding people who can (and are willing to) maintain them.

This is mostly true, although "maintenance" for those things is minimal. There are fewer moving parts you are responsible for maintaining, and the ones requiring ongoing changes (OS and software management) don't exist. To maintain an existing web application, you are on the hook to potentially ship updated libraries (but not runtimes) in your functions, and to pay your AWS bill. This is like a half-step above no maintenance at all.

If I build a web application for a client that I deploy using a modern serverless architecture, it will require virtually no hands-on maintenance from me for... years? If I build a web application with a more traditional stack, I will definitely need to charge some amount for maintenance because it's not feasible to ignore patching or assume patching won't break all the automation I'd have around scripting, health monitoring, deployment, and everything else.

That's a significant difference.

> 99% of that complexity is already shouldered by AWS and their ilk. They implement log forwarding, metric dashboards, instance health checks, and simple (complete) examples of how to scale based on CPU and memory - the two metrics used for scaling in most cases.

At what number do I scale up? At what number do I scale down? How do I detect when there's a problem with the instances coming up? And I'm familiar with AWS--they certainly help with those things, but it's still on you to have the log forwarding agent running on your box, to set up the dashboard, to ensure you have the separate agent running on your box to forward memory usage metrics, and to ensure you're not doing anything that will break your automatic minor version upgrades for your AMI (or manage your own, if you're not using EB or don't use that feature).

It's a whole lot better than doing it without those AWS services, but it's a significant step away from what you get with a serverless architecture.

> You mean RDS? Databases need to be maintained no matter how the application is run. For a quick personal anecdote, there's a world of hurt waiting unless someone is hired who knows how to manage and tune databases, no matter who runs the infrastructure.

If you use RDS, sure. If you're using DynamoDB or (soon) Serverless Aurora, it doesn't require nearly as much tuning or babysitting.

> How is this different? Linux troubleshooting skills won't help to identify if it's third party software or your software - and those pains don't go away magically with Lambda.

Sure they can. Linux troubleshooting skills would tell you if an updated third party tool is now leaking memory, for example. And they often do go away with Lambda because your functions run on a level of seconds or minutes instead of hours, days, or more. Every function is effectively terminated every few hours at most. You could have problems with your libraries, but that's a much more limited troubleshooting scope than an entire VM.

Nothing is reduced, just shifted away from your perspective. You pay for the server running your serverless code. You just share that cost.

Sharing the cost brings benefits but it also brings pain. The first time you hit an AWS hard limit, you will realize how much pain.

> Nothing is reduced, just shifted away from your perspective. You pay for the server running your serverless code. You just share that cost.

This is no different from any other abstraction.

> Sharing the cost brings benefits but it also brings pain. The first time you hit an AWS hard limit, you will realize how much pain.

Many AWS services have no architectural scaling limits. S3, Lambda, API Gateway, DynamoDB, SNS, and SQS are examples. And you can always take comfort in knowing that many people are actively using the service at far greater scale than you ever will.

That's not to say they won't ultimately limit (e.g.) how many S3 buckets you create, but being able to create unlimited S3 buckets is not necessary to use S3 as designed in a near-infinitely scalable way.

> Many AWS services have no architectural scaling limits. S3, Lambda, API Gateway, DynamoDB, SNS, and SQS are examples. And you can always take comfort in knowing that many people are actively using the service at far greater scale than you ever will.

All of those services have limits. Just because you haven't experienced that scale doesn't mean others haven't.

> All of those services have limits. Just because you haven't experienced that scale doesn't mean others haven't.

See the sentence literally immediately after the one you quoted.

People are already using those services all day every day at far greater scale than most of us can ever hope to achieve. It's meaningless to pontificate on theoretical limits when the practical ones are higher than you can reach.

So where exactly are we going to run this “on our own” in a regionally or globally distributed manner? Then we have to pay more people for netops to do things that don’t provide any competitive advantage.