How do you deal with Lambda concurrency? I have found it's pretty easy to hit 1K concurrent executions if functions take a long time to run and receive bursty traffic.
IIRC you can ask support for a concurrency limit increase, but what I would probably do first is split Lambda deployments and API endpoints (or whatever the trigger is) across regions so that the total load is distributed (the default limit is 1,000 concurrent executions per region). Obviously at this point you would also profile your code to shorten function execution time.
Do you mean you don't want it to handle 1k concurrent requests (you want some to be rejected or queued instead?) or do you mean that the concurrent execution causes some other problem?
Right, I'm referring to AWS limits. I was running a benchmark yesterday against a logging endpoint I made with a similar architecture to the article. One function is attached to a public ALB endpoint and does some validation then writes the event to SQS; this was taking 100-200 ms with 128 MB of RAM. A second function was attached to the SQS queue; its job was to pull events and write them out to an external service (Stackdriver, which sinks to BigQuery). This function was taking 800-1200 ms at 128 MB of RAM, or 300-500 ms at 512 MB (expensive!).
While running some load testing with Artillery I found that I was often getting 429 errors on my front-end endpoint. When pushing 500+ RPS, the 2nd function was taking up over 50% of the concurrent execution limit and new events coming into the front-end would get throttled and in this case thrown out. That also means that any future Lambdas in the same AWS account would exacerbate this problem. Our traffic is spiky and can easily hit 500+ RPS on occasion, so this really wasn't acceptable.
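As a rough sanity check on those numbers, Little's law (average in-flight executions = arrival rate x average duration) gives a similar picture. This is only a back-of-the-envelope sketch: it assumes one event per invocation (no SQS batching), uses midpoints of the durations quoted above, and takes the default 1,000-per-region limit as `REGION_LIMIT`.

```python
def estimated_concurrency(requests_per_second, avg_duration_seconds):
    """Little's law: average in-flight executions = arrival rate x average duration."""
    return requests_per_second * avg_duration_seconds


REGION_LIMIT = 1000  # assumed: default per-region concurrent execution limit

# Midpoints of the durations quoted in the thread (assumptions, not measurements).
front = estimated_concurrency(500, 0.15)  # validation function, ~100-200 ms
writer = estimated_concurrency(500, 1.0)  # SQS consumer at 128 MB, ~800-1200 ms

print(
    f"front-end: ~{front:.0f}, writer: ~{writer:.0f}, "
    f"share of limit: {(front + writer) / REGION_LIMIT:.0%}"
)
```

Under these assumptions the slow consumer alone accounts for about half the regional limit at 500 RPS, which lines up with the throttling described above.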
My solution was to refactor the 2nd function into a Fargate task that polls the SQS queue instead. It was easily able to handle any workload I threw at it, and also able to run 24/7 for a fraction of the cost of the Lambda. Each invocation of the Lambda was authenticating with the GCP SDK before passing the event, and the Lambda had to keep executing while both stages of network requests completed.
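For anyone wondering what such a polling worker might look like, here is a minimal sketch in Python. It is not the actual task from this setup: `forward` is a hypothetical callable standing in for the write to the external service, and `sqs` is assumed to be a boto3 SQS client (or anything exposing the same `receive_message` and `delete_message_batch` calls).

```python
def poll_once(sqs, queue_url, forward):
    """Receive up to 10 messages, forward them, then delete them from the queue.

    Returns the number of messages processed.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS maximum per receive call
        WaitTimeSeconds=20,      # long polling, avoids busy-waiting on an empty queue
    )
    messages = response.get("Messages", [])
    if not messages:
        return 0

    # Hypothetical sink: one authenticated client can be reused for every batch.
    forward([m["Body"] for m in messages])

    # Delete only after forwarding succeeded, so failures get redelivered.
    sqs.delete_message_batch(
        QueueUrl=queue_url,
        Entries=[
            {"Id": str(i), "ReceiptHandle": m["ReceiptHandle"]}
            for i, m in enumerate(messages)
        ],
    )
    return len(messages)


def run_forever(sqs, queue_url, forward):
    """Main loop for a long-running container task."""
    while True:
        poll_once(sqs, queue_url, forward)
```

Because the process is long-lived, authentication with the downstream service happens once at startup rather than once per event, which is presumably a large part of the cost difference.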
I'm happy to report I haven't been able to muster a test that breaks anything since I started using Fargate!
> the 2nd function was taking up over 50% of the concurrent execution limit and new events coming into the front-end would get throttled and in this case thrown out.
It sounds like you already found a great solution for your particular case. But it's also worth mentioning that you can apply per-function concurrency limits, which can be another way to prevent a particular function from consuming too much of the overall concurrency. For anyone whose Lambda workload is cheaper than a 24/7 task, that could be a good option.
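A minimal sketch of setting such a limit with boto3's `put_function_concurrency`, in case it helps. The function name `sqs-to-stackdriver` and the cap of 200 are made up for illustration; the client is passed in so the helper can be tried without AWS credentials.

```python
def cap_function_concurrency(lambda_client, function_name, max_concurrent):
    """Set reserved concurrency, which also caps the function at that value."""
    if max_concurrent < 0:
        raise ValueError("max_concurrent must be non-negative")
    return lambda_client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=max_concurrent,
    )


# Usage, assuming boto3 is installed and credentials are configured
# (function name and limit are hypothetical):
#   import boto3
#   cap_function_concurrency(boto3.client("lambda"), "sqs-to-stackdriver", 200)
```

With an SQS-triggered consumer capped this way, a burst should show up as a growing queue backlog rather than as throttling on the front-end function, though that is worth verifying under load.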
> Each invocation of the Lambda was authenticating with the GCP SDK before passing the event
I'm curious whether you tried moving the authentication outside of the handler function so it could be reused for multiple events? I've found that can make a huge difference for some use cases.
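To illustrate what I mean, here is a rough sketch in Python. `create_authenticated_client` is a hypothetical stand-in for the SDK's authentication step, and the counter exists only to show how often it runs. The idea is that module-level code runs once per execution environment, so warm invocations reuse the client.

```python
import time

AUTH_CALLS = 0  # demonstration only: counts how often authentication runs


def create_authenticated_client():
    """Hypothetical stand-in for an expensive SDK authentication step."""
    global AUTH_CALLS
    AUTH_CALLS += 1
    return {"token": "example-token", "created_at": time.time()}


# Module scope: executed once per cold start, not once per invocation.
CLIENT = create_authenticated_client()


def handler(event, context):
    """Lambda entry point; reuses the already-authenticated CLIENT."""
    records = event.get("Records", [])
    # A real handler would send `records` through CLIENT here.
    return {"forwarded": len(records), "token": CLIENT["token"]}
```

A real client would also need to handle token expiry, which this sketch ignores.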