Hacker News
by adameasterling 1321 days ago
This is from the perspective of a small startup CTO:

We've used AWS Lambda for about 4 years, and it's been so good and so cheap that I'm shifting literally everything (except Redis) to serverless. Also, GCP has a better serverless offering (Cloud Run, Spanner), so we're switching from AWS to GCP to take advantage of that. I bet we're going to see a massive cost reduction, but we'll see.

Things I like about serverless (again, from the perspective of a very small startup, with 5 engineers, and me being the primary architect):

* It's so liberating to not worry about EC2 servers and autoscaling and container orchestration myself. All our CloudFormation templates add up to around 3,000 lines, which maybe doesn't sound like a lot, but it's a lot. There are tons of little configuration things to worry about, and it adds up. (Not to mention the sheer amount of time it took to learn.) ECS Fargate takes care of some of this, but it doesn't autoscale based on demand or anything (not without setting things up yourself). (This is a big reason why I want to switch to GCP: Cloud Run is like Fargate in that it runs containers, but unlike Fargate it autoscales from 0 based on load.)

* It's very cheap in practice, at least for loads like ours that respond to events: API services that sometimes see a lot of use and sometimes see very little use; queue consumers that sometimes have a lot to do and sometimes have very little to do. AWS Lambda bills at millisecond resolution, and GCP Cloud Run/Cloud Functions round up to the next 100 milliseconds. These are very fine resolutions, and for us at least, costs have been small.

* For database serverless products (like DynamoDB for example), it's very liberating to never have to think "Hm, do we have enough CPU provisioned?"
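
To make the billing-resolution point concrete, here is a minimal sketch of how rounding up to a billing increment works. The function name and the 230.4 ms sample duration are made up for illustration; only the 1 ms and 100 ms increments come from the comment above.

```python
import math

def billed_ms(duration_ms: float, granularity_ms: int) -> int:
    """Round a duration up to the next billing increment (hypothetical helper)."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms

# The same 230.4 ms of work under the two granularities mentioned above.
print(billed_ms(230.4, 1))    # billed at 1 ms resolution
print(billed_ms(230.4, 100))  # billed at 100 ms resolution
```

Either way the rounding overhead is well under a second per billed interval, which is why bursty, event-driven loads can stay cheap.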

Things I don't like about serverless

* Pushing source code sucks. Lambda will just one day decide your version of Python or whatever isn't good enough and force your customers to upgrade all their user-written code to the latest Python version. (But! Cloud Run supports containers, and so this won't be a problem.)

7 comments

Does your team do local dev?

Every team I've known that adopted Lambda + DynamoDB (or equivalents) gave up on running their app locally, adding a lot of friction to the development process.

I highly recommend using AWS's own Chalice library as it makes local dev _and_ deployment very easy.

If you need more complex cases like deploying Docker containers to a Lambda function, take a look at AWS's SAM library. It also supports local dev _and_ makes deployment easy (it's essentially a wrapper around CloudFormation, so it's very powerful).

Within Lambda, yes, sucks. This is why container-based serverless is so much more exciting. (Which Cloud Run offers.)

I'm in the early stages of this rearchitecture but so far I've had no difficulty with local development.

This is one of my concerns as well. Apparently Azure Functions allows you to debug your function from within VS Code. I see lots of issues with this, in that 1. you are limited to VS Code as your editor and 2. you can only interact with Azure resources in a "development environment" within Azure itself, i.e. no local copy of the database etc.

I agree with this line of thought. (Also from the perspective of a small startup CTO with ~10 developers mixed across golang, python, and react.)

We used GCP at our previous startup (sold ) and ran our own K8S, when it was very new (2015). There were lots of pains in those days. So when we started our current startup in 2018, we started with App Engine (flexible, which supports containers). This was fine, but lots of drawbacks. After a year or two we ended up back on K8S, using GCP's GKE (managed K8S). Our team is pretty good with K8S, so it was fine. But regardless, the little stuff adds up.

Fast forward to about 6 months ago. We had used GCP's Cloud Run off and on for little stuff, and it kept getting better. One day someone asked the question why we shouldn't just use it for everything. Everyone was a bit defensive, but we kind of stared at each other and couldn't think of great reasons (for our use case), so we tried it.

Our setup consists of a primary API service (Golang), and a dozen or so smaller microservices, mainly in Python. We even moved most of our React apps to Cloud Run.

6 months in, and I can't really say anything bad. We turn off scale to 0 for the services where it matters. It scales up quickly under load, zero downtime over 6 months, no troubleshooting (so. much. time. saved.), super easy to deploy, swap traffic between versions, etc.

I'm not saying it's a silver bullet, nor that it's perfect for everyone... but I couldn't say enough good things about _container-based_ serverless like Cloud Run.

That said, breaking big systems down to the function level (Lambda, GCP Cloud Functions, etc) sounds like a nightmare to me. I'm sure there are ways, but that's a different ballgame. We do use FaaS for some tasks.

YMMV.

Edit: Oh, and our hosting bill went from ~$5k a month to $500 a month (in part due to other things, but primarily the lack of need for big node pools).

>GCP has a better serverless offering

I am starting to evaluate AWS vs. GCP for serverless. What, in your opinion, makes GCP better? Is the comment in the context of containers or functions?

I have limited time before my next meeting so I'll type real quick:

GCP Cloud Run is like the best of both worlds between AWS ECS Fargate and AWS Lambda. (Yes, the comment is in the context of containers. Sort of.)

* Like Fargate, Cloud Run hosts containers and takes care of figuring out where they actually live. Unlike Fargate, you don't have to say exactly how many containers you want running at once; GCP will automatically scale the # of containers up and down based on HTTP load and will scale down to 0. This should make Cloud Run cheaper than Fargate. (If you want to hook up Fargate to a webserver and you don't have autoscale figured out, you'll have to keep a lot of workers alive doing nothing.)

* Like Lambda, Cloud Run bills by the amount of time spent processing at least one request. But unlike Lambda, Cloud Run lets one container handle more than one request at a time (it sucks to have to spin up a lot of Lambda invocations that do a bunch of IO). Web servers that are good at concurrency shine here. This should save money.

* Cloud Run has more generous limits in many respects than Lambda. Cloud Run lets you set up SIGTERM hooks, so you can do some cleanup logic in your container (to e.g. write performance data to a timeseries table or whatever).
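
As a rough illustration of the SIGTERM hook idea, here is a minimal Python sketch of a container process that runs cleanup logic when it receives SIGTERM. The `flush_metrics` function and the `cleanup_log` list are hypothetical stand-ins for "write performance data to a timeseries table"; nothing here is specific to Cloud Run's API.

```python
import signal
import sys

cleanup_log = []  # hypothetical stand-in for a real metrics sink

def flush_metrics() -> None:
    """Placeholder cleanup step; a real service might write to a database here."""
    cleanup_log.append("metrics flushed")

def handle_sigterm(signum, frame) -> None:
    """Run cleanup, then exit cleanly so the platform can stop the container."""
    flush_metrics()
    sys.exit(0)

def install_sigterm_handler() -> None:
    # signal.signal must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, handle_sigterm)

if __name__ == "__main__":
    install_sigterm_handler()
```

The handler should stay short, since the platform only waits a limited grace period after sending the signal before killing the container.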

That's Cloud Run. On the database side: GCP Firestore is very interesting and we're going to build a big feature around it. AWS has nothing like it. On the queue side of things: we're planning to build around GCP Cloud Tasks; we've more or less built Cloud Tasks ourselves using a mix of MySQL and AWS SQS (and it was hard and we haven't done a good job).

I'd love to start a Discord or something to discuss these thoughts more. It's so hard to get good practical information for system architects/CTO types who just need to hammer stuff out.

A couple of points: 1. Cloud Run is more analogous to AWS App Runner than Fargate. 2. Cloud Run isn't a great analog to Lambda. Lambda is built to host functions. Cloud Run is built to host applications. Lambda is more analogous to GCP Cloud Functions. 3. On AWS, an equivalent of Cloud Tasks should probably be built with EventBridge + Lambda or EventBridge + Step Functions or EventBridge + ECS.

I don't profess to be a GCP expert so it's hard for me to make a judgement call on what's better. I can, however, say that most of this post ignores some of the real serverless power provided by AWS. AWS AppSync, AWS API Gateway, DynamoDB, CloudFront Functions, Lambda@Edge. It also makes comparisons that are not very fair.

Huh. I had this long call with our AWS Account Reps (+ Support Engineers) the other day and no one mentioned App Runner! This is the first I've heard of it. Looking at it now.

Ah I see, launched originally in May 2021. That's probably why they weren't aware of it. Yes, this looks cool. Very much what I was looking for.

The differences that I can see are...

* AWS App Runner lacks an advertised free tier. Not a big deal for all but the smallest projects though.

* AWS App Runner bills rounded up to the next second, whereas GCP Cloud Run rounds up to the next 100 milliseconds.

* AWS App Runner doesn't charge per request (?!), whereas GCP Cloud Run charges $0.40/1M requests.

* AWS App Runner has fewer CPU/RAM configuration options. The lack of low end options may be a blocker for us.

* It's cheaper than GCP Cloud Run - $51.83 @ vCPU/1GiB, but 2GiB minimum, in Runner vs $69.642 @ 1 vCPU/1GiB (v1) $97.50 @ 1 vCPU/1GiB (v2) in Cloud Run.

* I'm confused by the networking model. In App Runner, you have to make an ENI for your App Runner service to access your VPC? Weird. There's some extra cost there I think.
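
For a sense of scale on the per-request charge, here is a small back-of-the-envelope sketch. It uses only the $0.40/1M requests figure quoted above and ignores CPU/memory time and any free tier; the function name and the sample request volumes are made up for illustration.

```python
def request_fee_usd(requests: int, usd_per_million: float = 0.40) -> float:
    """Per-request charge only; ignores CPU/memory time and any free tier."""
    return requests / 1_000_000 * usd_per_million

# Hypothetical monthly request volumes.
for monthly_requests in (1_000_000, 25_000_000, 500_000_000):
    fee = request_fee_usd(monthly_requests)
    print(f"{monthly_requests:>11,} requests -> ${fee:,.2f}")
```

At small volumes the request fee is negligible next to the compute charge, so the lack of one on App Runner mostly matters for very high-traffic services.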

Things that I can't determine based on the documentation...

* Does App Runner support committed use discounts?

* Does App Runner throw a SIGTERM before shutting the container down? I hope yes but I can't find docs on it.

* Is there a file system accessible on App Runner and is it just in memory or is there actually a disk to write to?

* The quotas & limits page on App Runner feels incomplete and I'm left with a lot of questions about it.

* Is there an SLA?

* In fact the documentation for App Runner just feels a little incomplete.

It looks like AWS definitely wants App Runner to be the answer to Cloud Run, but to me, it feels like it's not quite there yet.

It's also weird that ECS Fargate lets you run a container without thinking about the server that it runs on, and App Runner does too, just with a few extra things. Why is it a whole separate service? Why didn't they just add it onto Fargate?

Re: Other services. I've only heard of API Gateway, DynamoDB and Lambda@Edge; I'll have to spend time investigating the other ones. Thank you for mentioning them!

I know this is an old thread now, but I just came back to it and thought to dig in a bit. First thing my Googling hit was this, which provides a good comparison of App Runner and Fargate

https://cloudonaut.io/fargate-vs-apprunner/

That lack of WAF support stood out. So Googling that:

https://github.com/aws/apprunner-roadmap/issues/58

"Hello, we are looking at supporting WAF in App Runner and will have more updates on this thread going forward. "

This falls in line with how I feel about Cloud Run. It really feels like a much better abstraction than ECS/GKE or functions. It is also more similar to how most devs currently work, and local dev is the same. For non-API traffic, like queues, it has some really weird quirks; the autoscaler, for example, is problematic. But our experience with GCP in general has ranged from mediocre to bad: the tech seems cool, but things don't quite fit together.

This has been my experience so far too. Serverless is amazing. It does require some shift in thought when it comes to backend architecture, but once you get there it really does provide everything it promises. Less cost, less complexity (if done right), and the scalability is amazing. I agree it's not perfect, but I highly recommend any backend or fullstack dev take a look at it if you haven't already.

Been working on a lambda, dynamodb, typescript react app for about 6 months for work and it's been just mind-melting how much money and complexity we've saved switching. I'm talking like 5x the cost drop and we can more easily onboard devs to the project because it's just simpler, no need for a devops hire honestly.

Can you expand a bit on the complexity part? How exactly does serverless reduce it?

The only thing that comes to mind is better isolation between parts of the app, but this could be achieved with any architecture if done right.

We don’t have to manage a server, that’s really the drop in complexity. We just write a function and it runs on a machine somewhere in the cloud. As we scale, AWS just handles it automatically. If the demand decreases, no problem. We just pay for what we use. Of course there are ways to handle scale with servers, but with serverless you barely have to think about it.

> I'm shifting literally everything (except Redis) to serverless

A new company called Momento just launched that offers a serverless cache. Might be of interest to you.

https://www.gomomento.com

AWS Lambda supports shipping your Lambdas as containers now. Very nice experience.

Was going to comment this same thing. We've been finding that the cold-start times are somewhat worse, but not disastrously so.

I just started working with Lambda, so my experience may just be teething pains, but I find the developer experience a bit jarring. My Lambda is in Python, but every time I want to test my code I have to "build" the Lambda with the `sam build` tool, and only then can I exercise it. It takes a non-trivial amount of time to build.

The deployment story looks good though, with the `sam deploy`, but for now I can't get over the developer experience.

My recommendation is to invest in testing now while it’s easy. You can mock the incoming events and use something like moto for service mocks. That catches dumb mistakes sooner.

Also, don’t put any logic in the handler itself unless it’s extremely trivial.
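
A minimal sketch of what that can look like, assuming an API Gateway style event with a JSON `body` field. The `build_greeting` function and the event contents are made up for illustration, and service mocks like moto are left out to keep it self-contained.

```python
import json

def build_greeting(name: str) -> str:
    """Pure business logic: no Lambda or AWS types involved, so it is easy to test."""
    return f"Hello, {name}!"

def handler(event, context):
    """Thin entry point: parse the event, delegate, format the response."""
    body = json.loads(event.get("body") or "{}")
    message = build_greeting(body.get("name", "world"))
    return {"statusCode": 200, "body": json.dumps({"message": message})}

# A hand-built fake event containing only the field the handler reads.
fake_event = {"body": json.dumps({"name": "Ada"})}
response = handler(fake_event, None)
print(response)
```

Because the handler only parses and delegates, the logic can be exercised with plain function calls and hand-built events, with no build or deploy step in the loop.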

Sound advice, but that ship has sailed. This is an existing Lambda that I'm refactoring, so I need to be able to "go through the front door", as it were. Once I become more familiar with the code, I'll be able to test individual pieces as you're recommending.