by untog 3773 days ago
Off topic, but I'm hoping there are some Lambda-heads in the room. I want to write a system that basically rebroadcasts a message sent over SNS, to different HTTP endpoints. (I don't have control over these endpoints so can't use SNS itself as I can't confirm subscriptions).

How many HTTP requests can Lambda do concurrently? Is my best approach to fire all these requests inside one worker, or should/could I have it spin up subsequent Lambdas whose only function is to run the HTTP request and then exit? I'm imagining that would be a lot more expensive.

2 comments

There's a per-invocation cost that, at ridiculous volumes, becomes non-trivial ($0.20 per million). We get very high throughput (my back of the envelope math says about 3 writes per millisecond per Lambda, that's to a Cassandra cluster) for I/O intensive operations. You can have up to 100 simultaneous invocations, and you can ask for more (we did). Without knowing more about your situation, I would suggest that you use a library that lets you fire off a bunch of async requests and block on them all. Play around with RAM/CPU (one knob for both)--a higher setting may result in quicker processing at a lower cost (!). If you're highly cost sensitive, consider batching your SNS messages--remember that it supports 64K payloads. (We use SNS to do batchloading, actually--it's a cheap, managed alternative to Kinesis.)
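
To make the "fire off a bunch of async requests and block on them all" suggestion concrete, here is a minimal sketch using only the Python standard library. The names `post_json` and `broadcast`, the JSON POST format, the 5 second timeout, and the worker cap of 32 are all assumptions for illustration, not anything from the setup described above.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def post_json(url, message, timeout=5.0):
    """POST `message` as JSON to `url` and return the HTTP status code."""
    body = json.dumps(message).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def broadcast(message, endpoints, send=post_json, max_workers=32):
    """Send `message` to every endpoint concurrently and block until all finish.

    Returns {url: status code, or the exception raised for that endpoint}.
    """
    def attempt(url):
        try:
            return send(url, message)
        except Exception as exc:  # one bad endpoint should not fail the batch
            return exc

    if not endpoints:
        return {}
    workers = min(max_workers, len(endpoints))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(attempt, endpoints))  # blocks until all are done
    return dict(zip(endpoints, results))
```

Inside a Lambda handler you would probably call `broadcast(sns_message, endpoint_list)` once per incoming message. Threads are used here because the work is I/O bound; an async HTTP client would be a reasonable swap.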

Should you choose the fanout route, Tim Wagner from AWS told me that it's pretty fast: https://twitter.com/timallenwagner/status/658025794900365312

My guess: all you can do in the 300 sec execution limit
The tricky part is that it wouldn't work if you just sat there in a tight loop dispatching HTTP requests: any one of them timing out would likely eat into the deadline and keep all subsequent HTTP requests from happening.
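
One way to soften the deadline problem is to cap each request's timeout by the time still left, and stop dispatching before the limit hits. This is a rough sketch, not a tested production pattern; `dispatch_with_deadline`, the `send(url, timeout)` callable, and the default timeout and reserve values are made up for illustration. In a real Lambda the budget could come from `context.get_remaining_time_in_millis()`.

```python
import time


def dispatch_with_deadline(endpoints, send, budget_seconds,
                           per_request_timeout=3.0, reserve_seconds=1.0,
                           clock=time.monotonic):
    """Call send(url, timeout) for each endpoint without overrunning the budget.

    Returns three lists of URLs: (sent, failed, skipped).
    """
    deadline = clock() + budget_seconds
    sent, failed, skipped = [], [], []
    for i, url in enumerate(endpoints):
        # Keep a small reserve so the function can still return cleanly.
        remaining = deadline - clock() - reserve_seconds
        if remaining <= 0:
            skipped.extend(endpoints[i:])  # out of time: report what is left
            break
        try:
            # Never let a single request wait longer than the time we have.
            send(url, min(per_request_timeout, remaining))
            sent.append(url)
        except Exception:
            failed.append(url)
    return sent, failed, skipped
```

The skipped list could be pushed back onto a queue or handed to another invocation, so a few slow endpoints cost you time but not messages.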

So, alternatively, you could do something with DynamoDB event sources, where you have some sort of pub/sub table that your lambda functions listen on (basically a list of all the http requests that have to happen) - thus keeping a minimal 1 lambda dispatch per http request. The catch is you would need another system to manage that table (technically that system can be lambda itself).
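
For what it's worth, a handler on the receiving end of that table might look roughly like the sketch below. It assumes a DynamoDB stream event where each inserted row has string attributes named `url` and `body`; those attribute names, `pending_requests`, and the injectable `send` callable are all hypothetical. With a batch size of 1 on the stream mapping, each invocation would carry a single row, which is what keeps it at one dispatch per HTTP request.

```python
def pending_requests(event):
    """Extract (url, body) pairs from newly inserted rows in a stream event."""
    out = []
    for record in event.get("Records", []):
        if record.get("eventName") != "INSERT":
            continue  # ignore updates and deletes
        image = record.get("dynamodb", {}).get("NewImage", {})
        url = image.get("url", {}).get("S")        # hypothetical attribute name
        body = image.get("body", {}).get("S", "")  # hypothetical attribute name
        if url:
            out.append((url, body))
    return out


def handler(event, context, send=None):
    """Lambda entry point: perform one HTTP request per inserted row."""
    if send is None:
        # Placeholder that does nothing; plug in a real HTTP client here.
        send = lambda url, body: None
    requests = pending_requests(event)
    for url, body in requests:
        send(url, body)
    return {"dispatched": len(requests)}
```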

Two important things: 1) I haven't used the DynamoDB/Lambda integration myself, so be skeptical of my suggestion, and 2) what I can say from our usage of the S3/Lambda integration is that concurrency is not a problem, with thousands of Lambda dispatches/second being surprisingly quick to spin up.