| In the preview release, you can set the timeout to anything between 1 and 60 seconds; if you don't specify one, the default is 3 seconds. I can stream a 100 MB chunk from S3 and map it concurrently as it streams in, all within 10 to 15 seconds, so sixty seconds is more than enough time to process a chunk. The bigger issue is that during the preview, Lambda is limited to 25 concurrent functions. If Amazon delivers a product where "the same code that works for one request a day also works for a thousand requests a second[1]," then you might be able to analyze hundreds of gigabytes of data in a few seconds, spin up no servers, and pay only for the few seconds that you use. Take 500 GB as 5000 chunks of 100 MB each. 1000 concurrent tasks, each running 10 seconds, would get through those chunks in 5 waves, or 50 seconds. You would use 5000 Lambda requests out of your free monthly allotment of 1,000,000. You'd also consume 5000 * 0.1 GB * 10 seconds = 5000 GB-seconds of your free monthly allotment of 400,000. S3 transfer is free within the same region, and S3 requests cost $0.004 per 10,000 GETs, or $0.002 for this query. Even after you exhaust the free Lambda allotment, processing 500 GB would cost $0.000000208 per 100 ms * 100 * 5000, or about 10 cents. Scaling this up, querying 10 terabytes would take about 20 minutes to execute, cost $2 for the query, and about $300 per month for storage. For sporadic workloads it might be more responsive and much cheaper than spinning up a fleet of machines for Hadoop or Spark. [1] http://www.allthingsdistributed.com/2014/11/aws-lambda.html |
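The back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not a measurement: the prices are the ones quoted in the comment, while the function name `estimate`, its parameter names, and the assumption that every chunk takes exactly 10 seconds on a 0.1 GB function are illustrative.

```python
import math

# Figures quoted in the comment above; they are inputs to the estimate,
# not values looked up from any API.
PRICE_PER_100MS = 0.000000208   # USD per 100 ms of execution
S3_PRICE_PER_10K_GETS = 0.004   # USD per 10,000 GET requests


def estimate(total_gb, chunk_gb=0.1, memory_gb=0.1,
             seconds_per_chunk=10, concurrency=1000):
    """Rough time and cost estimate for mapping total_gb of S3 data.

    Assumes (hypothetically) one GET and one Lambda invocation per chunk,
    and that chunks run in full waves of `concurrency` tasks.
    """
    chunks = round(total_gb / chunk_gb)
    waves = math.ceil(chunks / concurrency)
    billing_units = seconds_per_chunk * 10  # 100 ms units per invocation
    return {
        "chunks": chunks,
        "wall_seconds": waves * seconds_per_chunk,
        "gb_seconds": chunks * memory_gb * seconds_per_chunk,
        "lambda_usd": PRICE_PER_100MS * billing_units * chunks,
        "s3_usd": S3_PRICE_PER_10K_GETS * chunks / 10_000,
    }


if __name__ == "__main__":
    for size_gb in (500, 10_000):
        print(size_gb, "GB:", estimate(size_gb))
```

With these assumptions, `estimate(500)` gives 5000 chunks and 50 seconds of wall time, and `estimate(10_000)` gives 1000 seconds (a little under 17 minutes) and roughly $2 of Lambda charges, which lines up with the "about 20 minutes" and "$2" figures in the text.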