HN Mirror

by merty 2304 days ago

This isn’t actually one of those solutions where Lambda shines, pricing-wise.

I would simply trigger a Lambda function once a minute (or every X minutes) using CloudWatch to fetch the latest articles and save them to an S3 bucket which I would expose and cache using CloudFront or any other CDN service.

This would lead to:

- No Lambda costs as it would be covered by the monthly free tier of 1M requests.

- No storage costs as the size of the stored data would be extremely small.

- Really fast responses as the “response” would actually be a static file cached at the CDN.

- The only parameter defining your cost would be your CDN of choice, which would cost somewhere between free and as low as $10/TB. For a project like the one in the article, that’s hundreds of millions of requests for just $10.
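
A minimal sketch of the setup described above, assuming a Python Lambda on a once-a-minute CloudWatch schedule. `FEED_URL`, `BUCKET`, the `latest.json` key, and the JSON layout are all hypothetical placeholders, not details from the article. The fetch and upload steps are passed in as functions so the core logic can be exercised without AWS.

```python
import json
from typing import Callable

FEED_URL = "https://example.com/latest-articles.json"  # hypothetical source
BUCKET = "my-articles-bucket"                          # hypothetical bucket


def build_snapshot(articles: list) -> bytes:
    """Serialize the fetched articles into the static file the CDN will serve."""
    return json.dumps({"articles": articles}, separators=(",", ":")).encode("utf-8")


def refresh(fetch: Callable[[], list],
            put: Callable[[str, bytes], None],
            key: str = "latest.json") -> int:
    """Fetch the latest articles, store them under `key`, return the byte size."""
    body = build_snapshot(fetch())
    put(key, body)
    return len(body)


def handler(event, context):
    """Lambda entry point, meant to be invoked by a scheduled CloudWatch rule."""
    import urllib.request
    import boto3  # imported here so the module loads without AWS libraries

    s3 = boto3.client("s3")

    def fetch() -> list:
        with urllib.request.urlopen(FEED_URL, timeout=10) as resp:
            return json.load(resp)

    def put(key: str, body: bytes) -> None:
        # CacheControl lets the CDN hold the file until the next refresh.
        s3.put_object(Bucket=BUCKET, Key=key, Body=body,
                      ContentType="application/json",
                      CacheControl="public, max-age=60")

    return refresh(fetch, put)
```

With a 60-second schedule and a matching `max-age`, readers only ever hit the CDN; the function itself runs about 43,200 times in a 30-day month.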

NathanKP 2304 days ago

Yep, that is exactly the architecture that I use to watch over 600k GitHub repo changelogs for https://changelogs.md

Lambda generates static HTML in the background and puts it in S3, and the static HTML gets served via CloudFront.

The Lambda costs are a whopping 26 cents per month, for over 2 million Lambda invocations per month. If anyone is interested in the architecture, I've developed this website as an open source project here, for people to learn from: https://github.com/aws-samples/aws-cdk-changelogs-demo

rumanator 2304 days ago

> No Lambda costs as it would be covered by the monthly free tier of 1M requests.

That's far from the full picture. Lambda is billed in units of compute: memory allocated, in multiples of 64MB, per 100ms of execution time, with each rounded up to the next increment and a minimum of 128MB. So you only pay a flat fee per request if all your requests are short-lived and barely use any resources. Long-running tasks that need a bit of RAM are charged several times the cost of a single request.
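
A rough sketch of the rounding described here, taking the 64MB and 100ms increments and the 128MB minimum from the comment itself as they stood at the time. The function name is made up for illustration, and the result is in GB-seconds rather than dollars, since no per-unit price is quoted in this thread.

```python
import math


def billed_gb_seconds(duration_ms: float, memory_mb: float) -> float:
    """GB-seconds billed for one invocation under the rules described above."""
    # Duration is rounded up to the next 100 ms step.
    billed_ms = math.ceil(duration_ms / 100) * 100
    # Memory is rounded up to the next 64 MB step, with a 128 MB floor.
    billed_mb = max(128, math.ceil(memory_mb / 64) * 64)
    return (billed_mb / 1024) * (billed_ms / 1000)


# A 1 ms call at the minimum size is still billed as 100 ms at 128 MB.
short_call = billed_gb_seconds(1, 100)
# A 250 ms call needing 200 MB is billed as 300 ms at 256 MB.
longer_call = billed_gb_seconds(250, 200)
```

Under these rules the second call costs six times as much compute as the first, even though both count as a single request.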

merty 2304 days ago

You’re right, I should have mentioned that as well.

I didn’t go into those details because I was strictly talking about the project in the article and the compute time limit would not be exceeded for this project either.

400,000 GB-seconds are free every month, and even if the Lambda function ran for 2,592,000 seconds (equal to a 30-day month, way more than enough) using 128 MB of RAM (again, more than enough for a task like this), it would only use 324,000 GB-seconds.
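
The arithmetic can be checked in a few lines; all figures come from the comment above, and only the variable names are new.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60   # 2,592,000 seconds in a 30-day month
MEMORY_GB = 128 / 1024                  # 128 MB expressed in GB
FREE_TIER_GB_SECONDS = 400_000          # monthly free compute allowance

# Worst case: the function runs nonstop for the whole month.
used_gb_seconds = SECONDS_PER_MONTH * MEMORY_GB
headroom = FREE_TIER_GB_SECONDS - used_gb_seconds

print(used_gb_seconds)  # 324000.0
print(headroom)         # 76000.0
```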

ignoramous 2304 days ago

> I would simply trigger a Lambda function once a minute (or every X minutes) using CloudWatch to fetch the latest articles and save them to an S3 bucket which I would expose and cache using CloudFront or any other CDN service.

Lots of upsides to this design. It pretty much outlines a toned-down version of a very large, high-throughput, low-latency, globally distributed configuration system with strict write-ordering but near-realtime write-propagation guarantees that a sister team worked on (though, I hear, they're redesigning it for reasons not relevant in this context). There is much to like about it.

Fetching items from S3 (fronted by a CDN or not) would require managing credentials on the client side, though? Doable, but it may require additional code for an auth service (AWS Cognito or AWS STS or...)?

merty 2304 days ago

There are many different ways to do it.

You can simply whitelist the IP addresses of the CDN in your bucket policy (many CDNs publish them in their documentation or provide an API for it). It’s important to schedule a Lambda to run every now and then to check whether the IP addresses have changed and update the policy accordingly.
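
A sketch of what such a policy could look like, built as a plain Python dict. The bucket name and CIDR ranges are placeholders (the ranges are reserved documentation addresses, not any real CDN's), and the helper names are hypothetical. `aws:SourceIp` is the standard condition key for matching the caller's address.

```python
def ip_allowlist_policy(bucket: str, cdn_cidrs: list) -> dict:
    """Bucket policy allowing object reads only from the given CDN ranges."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCdnEdgeIps",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Sorted so that two policies with the same ranges compare equal.
            "Condition": {"IpAddress": {"aws:SourceIp": sorted(cdn_cidrs)}},
        }],
    }


def policy_needs_update(current: dict, bucket: str, cdn_cidrs: list) -> bool:
    """True when the CDN's published ranges no longer match the live policy."""
    return current != ip_allowlist_policy(bucket, cdn_cidrs)


# Placeholder values for illustration only.
policy = ip_allowlist_policy("my-articles-bucket",
                             ["198.51.100.0/24", "192.0.2.0/24"])
```

The scheduled Lambda would fetch the CDN's current ranges, call `policy_needs_update`, and only when it returns true push the new document with boto3's `put_bucket_policy(Bucket=..., Policy=json.dumps(policy))`.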

Another way would be to set a custom header with a token on the CDN to be sent in requests to the origin, which you can, again, whitelist in your bucket policy.
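
A sketch of the header variant, with one assumption stated up front: as far as I know, S3 bucket policies can only match a small set of request headers, so this uses the common workaround of having the CDN send the secret in the `Referer` header and matching it with the `aws:Referer` condition key. Whether that is the header the comment has in mind is not stated. The bucket name, token, and function name are placeholders.

```python
def header_token_policy(bucket: str, token: str) -> dict:
    """Bucket policy allowing reads only when the shared token is presented."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowCdnWithToken",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            # Assumption: the CDN is configured to send `token` as the
            # Referer header on every request to the origin.
            "Condition": {"StringEquals": {"aws:Referer": token}},
        }],
    }


# Placeholder values; a real token should be long, random, and kept secret.
token_policy = header_token_policy("my-articles-bucket", "example-shared-token")
```

Unlike the IP list, this needs no scheduled refresh, but anyone who learns the token can read the bucket directly, so it is worth rotating from time to time.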

tus88 2304 days ago

Why use S3 and a cache? Why not just cache the output of the Lambda directly? Suspicious.
