| I remember a few years ago we tried to implement a scheduled Lambda that needed to download a bunch of files from an S3 prefix, perform some aggregation on the data and then write the result to a database. Our EC2 prototype of this, on one of the m3 class instances, could do the work in about 2 minutes, which seemed a perfect opportunity to port to Lambda. Even on the top memory setting at the time (1536 MB), the job just couldn't finish, timing out after 5 minutes. The code was multithreaded, to parallelise the downloads, but no matter how much we tweaked this the Lambda would just never complete in time. As you don't have visibility of the internals, we didn't know whether this was due to CPU constraints (decompressing lots of GZIP streams), network saturation (downloading files from S3) or what. In the end we gave up. We didn't have the time or resource to keep digging, and just pinned the problem on our use case being at odds with what Lambda is designed for. I'm not saying this is an indictment of Lambda; we use it in lots of places, with a lot of critical path code (ETL pipelines). |