This is huge for Lambda. It allows devs to create “serverless” apps [1], with relational databases, without 10+ second cold-start times. In the article, they measure it as 988ms.
I have tried building an API using API Gateway <-> Lambda, but had to choose between using DynamoDB to store data (no-SQL, so challenging to query) or suffering unacceptably long response times whenever a request happens to cause a cold-start. Theoretically, this problem is now going away!
But VPC is not an especially efficient additional "defense-in-depth" layer against this kind of "fucked up both firewall and password" configuration mistake. The obvious first layers are passwords, network-level firewalling and, of course, host-level firewalling, and after that you can add monitoring / port scanning for all your "must be firewalled" services. And you can mandate better-than-passwords authentication methods[1]. Etc. The latter are better because they are more general and don't add costly complexity to your networking topology (by way of NAT and/or ambiguous rfc1918 addressing).
Are you proposing that by acknowledging defense-in-depth, consistency dictates that one should pile up as many layers per attack vector as possible? Maybe, if you have infinite resources and don't need to make compromises on where you spend effort and resources in your risk management plan. But that's rarely the case in the real world.
You raise a fair point: this was possible, although it seems safe to say it would be a compromise on security.
I think it’s best not to expose the DB to outside connections in general, although it is still possible [1] when using RDS instances.
I think this is different for things like DynamoDB because, instead of a standard SQL-like db “connection”, they use AWS role-based auth for each request.
Of course, one could always configure some type of proxy service between the lambda and the DB... but that seems antithetical to going “serverless” in the first place.
I think Aurora Serverless has even worse [1] cold-start times (for the DB itself), and it was intended as more of a price-optimization than a performance boost.
Aurora Serverless also handles connections. The problem of having a burst of 1000 concurrent invocations accessing your databases still exists even with VPC access
If you put an event bus in the middle (Kinesis) your api-lambda functions don't need direct access to your RDS.
Subscribe lambda functions to your Kinesis stream, and let them handle the link to your RDS. This way you won't notice the cold starts.
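Roughly what I mean, as a minimal Python sketch rather than anything production-ready. The stream name, the `write_row` callable and passing the client in as an argument are all my own assumptions for illustration (a real handler takes `(event, context)` and would build the boto3 client at module level). It only covers the write path.

```python
import base64
import json


def api_handler(event, kinesis_client, stream_name="orders-stream"):
    """API-facing Lambda: publish the write to Kinesis instead of touching RDS.

    kinesis_client is assumed to be boto3.client("kinesis"). The stream name
    is hypothetical. This function needs no VPC attachment.
    """
    payload = json.dumps(event["body"]).encode("utf-8")
    kinesis_client.put_record(
        StreamName=stream_name,
        Data=payload,
        PartitionKey=str(event["body"]["id"]),
    )
    # 202: the write is accepted but not yet persisted in the database.
    return {"statusCode": 202}


def consumer_handler(event, write_row):
    """Stream-subscribed Lambda: persist each record and return the count.

    write_row is a hypothetical callable that performs the actual INSERT.
    """
    written = 0
    for record in event["Records"]:
        # Kinesis record data arrives base64-encoded in the Lambda event.
        raw = base64.b64decode(record["kinesis"]["data"])
        write_row(json.loads(raw))
        written += 1
    return written
```

The caller gets a 202 straight away and the consumer absorbs any VPC cold start in the background. Reads would still need some other path to the data.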
This comment doesn't seem to make sense, could you elaborate a bit? How would you replace the database working as a persistence layer to an API application by polling an event stream?
It doesn't replace the DB. It just uses Kinesis as the messaging provider from the lambda to the DB and back. Not sure that's a great idea TBH but who knows?!
AWS announced this enhancement at 2018 re:Invent. It was slated for "sometime in 2019". I was excited, and I'm impressed that they released the feature well ahead of the end of the year (and before the next conference, which would obviously raise a few questions)
This has been a /major/ sore point for Lambda use, amazing they fixed it, and always great to see they've documented the intense engineering requirements involved to make it happen.
AWS is a beautiful mix of business and technology, it's very rare to see such a large engineering-driven organization managing to balance customer friendliness. I'm an unashamed fanboy
I’m one of the harshest critics of “lift and shifters” - old school net ops people who get one certificate by watching an ACloudGuru video, duplicate their on prem infrastructure and processes to the cloud and don’t go all in on the advantages of it and end up costing their clients more - but nowhere is it considered “legacy” to not use perimeter security.
Solves might be strong, but it removes a big portion of the cold start latency that was difficult to optimize for and out of the control of developers. Creating minimal images isn't difficult for a number of environments (e.g. webpacking your node.js lambdas) and barring necessarily large images (think pandas on Lambda) this puts a lot of control for the cold start p99 back in the hands of customers.
Warming functions in the previous VPC architecture was always a questionable practice. You had no guarantee that your environments would be warm across all subnets or which subnets would handle incoming requests. Beyond that, what happens to requests which you receive when the function is being warmed? You still incur cold starts.
There has never been a guarantee of environment reuse. Any architecture which isn't capable of tolerating cold starts is not a good fit for serverless.
Sorry but it does not matter how many, since everything is automated and you create the warm up scheduler when you create the function. As others pointed out in this thread, there are other challenges with this approach.
>> Just use Fargate
We were trying to and we decided that is not our cup of tea. Lambdas are.
Yes it does matter. In your scheduler, how do you ensure your ping (the way you start an instance) is actually creating another instance to keep warm or reusing another instance?
If you want to always keep 20 instances warm, you have to keep the first ping active until the 20th one is done.
In other words, if you want to keep 20 active instances warm and you send 20 requests in 5 seconds, and each request only takes .25 seconds, you will only have 5 warm lambdas. The 6th real concurrent connection will still have a cold start. Also, while you are pinging an instance to keep it warm, that instance can't serve a real user.
Also, API Gateway has an algorithm to decide whether to launch a new lambda or hold a request, hoping that an already warm lambda will free up.
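To make the keep-warm point concrete, here is a rough Python model. The assumption (mine, not anything AWS documents) is that a ping only forces a new instance when every existing instance is still busy, so the number of instances kept warm equals the peak number of overlapping pings. Real routing is not modelled.

```python
def warm_instances(pings):
    """Return the peak number of overlapping pings.

    pings is a list of (start_seconds, duration_seconds) tuples. Under the
    stated assumption, this peak is the number of instances kept warm.
    """
    events = []
    for start, duration in pings:
        events.append((start, 1))              # ping begins, one more busy
        events.append((start + duration, -1))  # ping ends, one fewer busy
    # Sort by time; at equal times, ends (-1) are processed before starts (+1).
    events.sort(key=lambda e: (e[0], e[1]))
    busy = peak = 0
    for _, delta in events:
        busy += delta
        peak = max(peak, busy)
    return peak
```

In this model, twenty pings sent at the same instant and held open for one second give 20, while the same twenty short pings sent one second apart give 1. That is why the pings have to overlap.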
This is great news, but I'm bummed they didn't bundle the NAT gateway with this service. In a typical function that calls out to get data from a service and reads/writes from a DB in a VPC, that requires the somewhat painful configuration of a NAT gateway and dedicated subnets, as well as a $36/month bill for the NAT gateway service.
There are some workarounds that use multiple lambdas, but they have their own gotchas.
Still, hooray, this is good news. The Data API is great for Serverless Aurora, but I can't use that with BI tools.
This is not meant to be a criticism of AWS, I’m an AWS true believer, but the main purpose of going to AWS is to make the “undifferentiated heavy lifting” someone else’s problem not to save money.
Going to AWS to save money on resources is about like going to the Apple Store to buy a cheap laptop.
I’m not against AWS or $36/mo. It just is kinda a drag when the promise of serverless is pay per use and scaling to zero. You could get a nice EC2 t3.medium and do a lot more RPS for the cost of that NAT and Lambda invocations.
If you don’t care about cold starts, there is always Aurora Serverless with the Data API. I don’t believe it requires either a NAT or for the lambda to be attached to your VPC.
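For anyone who hasn't used it, a query through the Data API looks roughly like this in Python. `rds_data_client` is assumed to be `boto3.client("rds-data")`. The database name, table and column are hypothetical, and the ARNs are whatever your cluster and Secrets Manager secret happen to be.

```python
def fetch_user_names(rds_data_client, cluster_arn, secret_arn, min_id):
    """Run a parameterised SELECT over HTTPS via the Data API.

    No database connection or VPC attachment is opened by this code. The
    "app" database and "users" table are hypothetical names.
    """
    response = rds_data_client.execute_statement(
        resourceArn=cluster_arn,
        secretArn=secret_arn,
        database="app",
        sql="SELECT name FROM users WHERE id >= :min_id",
        parameters=[{"name": "min_id", "value": {"longValue": min_id}}],
    )
    # Each row is a list of typed field dicts, e.g. {"stringValue": "ann"}.
    return [row[0].get("stringValue") for row in response["records"]]
```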
AWS does tons of stuff around VPCs....I feel like they really want me to use them (or their customers really want to use them), but I just don't see why.
I just run RDS on the internet. I don't have to muck with the complexity or cost of NATs or peering or Lambda slow start or any other weird networking issues.
I know it's "public", but that seems irrelevant in the era of cloud services. This isn't any different than, say, how Firebase or a million other services run. Should I be concerned that my Firebase apps are insecure because someone isn't overlaying a 10.* network on them?
EDIT: I should clarify that I understand the legitimacy of security groups, especially for technologies that weren't meant to operate outside a firewall. But that's mostly a different subject; AWS had security groups years before VPCs and subnets and NATs.
So making the actual listening port for a database server "public" is generally a bad idea as that is another attack surface of code that honestly is hardly ever made public... but if when you say "public" you mean you are using security groups (which are super trivial to use and easy to understand) to define which other AWS devices can access the port, then yeah: I have never seen any reason why this entire feature should exist and the concept of having to think about IP address ranges as if they somehow matter is one of the things I was escaping when I moved to cloud in the first place, and somehow they wanted to reintroduce it? Why?!? It doesn't even work well (!!), and introduces tons of latency into everything it touches (not just Lambda) :/.
VPCs are very helpful when you have a large number of developers working in an AWS environment. It'd be oh so easy for a developer to accidentally change a bit of terraform and expose your database to the internet without VPCs.
My theory is that a bunch of entrenched network engineers just really like subnets and IPv4 and NAT and don't realize how mostly unnecessary it is in an era of cloud infrastructure and IPv6.
I think it is easy to have dev, qa and prod VPCs. Without VPC these separate infrastructure groups might be harder to split out. I usually reference security groups instead of subnets in security groups, avoiding referencing IP ranges (v4 or v6) entirely.
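A sketch of what referencing a security group instead of an IP range looks like, in Python with boto3-shaped arguments. `ec2_client` is assumed to be `boto3.client("ec2")`, and the group IDs are placeholders, not real resources.

```python
def allow_from_group(source_group_id, port):
    """Build an ingress rule that names a source security group, not a CIDR."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Members of this group may connect, whatever their IP addresses are.
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


def open_db_to_app(ec2_client, db_group_id, app_group_id, port=5432):
    """Attach the rule to the database's security group and return it."""
    rule = allow_from_group(app_group_id, port)
    ec2_client.authorize_security_group_ingress(
        GroupId=db_group_id,
        IpPermissions=[rule],
    )
    return rule
```

No v4 or v6 range appears anywhere in the rule, so renumbering subnets later doesn't touch it.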
You can also use multiple AWS accounts to separate those environments, which also eases user management (usually you have different people with access to each environment, with some overlap).
This also means that developers can have close to admin privileges, since the worst they can do, is to disrupt work of another developer, without affecting either QA or production.
Accounts are the correct level to separate these at. Keeps credentials easier to manage for devs, techs, etc, and limits blast radius if unauthorized accesses take place.
Which is really an old school way of doing it. Having multiple VPCs doesn't get around account limits or clashes between resources with the same name (SNS topics, queues, stacks, etc.)
Just use separate accounts in an Organization.
You can give your developers almost complete unrestricted access to your dev account.
I work with both styles of AWS installations. Having an organization is a constant pain for people who access multiple accounts, even with something like Okta. 1 browser can access 1 account and if you switch you have to go through the switching process, or use multiple browsers. Quite often people would like to access cross account resources, which is a whole different level of discomfort. This is why I still think, old school or not, that having a single account and multiple VPCs is a better option.
This is true, AWS is pretty anti-internet in all their architecture recommendations. IMO security is better done by firewalling and protocol level authentication (belt + suspenders) because it keeps your configuration clean and understandable, and complexity is the enemy of security.
The attitude serves AWS's interest in two ways: 1) keep lock-in by encouraging customers to build AWS-internal networks 2) don't scare away the lift-and-shift customers who want to transplant their 1990s style "intranet" (or mental model, at least) onto AWS.
It also explains why they aren't very keen on IPv6, because that would encourage internetworking.
Just don't tell anyone that you can access the AWS console from the internet :)
It’s never been considered best practice to expose services needlessly to the Internet. I’m as far from an old school net ops guy as you can get and jump at any feasible new AWS technology as quickly as anyone, but it would be the height of stupidity for me to expose my Aurora cluster to the Internet. Good luck explaining that to your external auditors.
Of course. I'm just saying that firewalling and end-to-end security are better ways of doing that than routing and ambiguous (rfc1918) addressing. Never trust the network, lest you end up making yours soft and chewy on the inside.
How do you propose you firewall your database access and only allow certain IP addresses when you need access from lambda when the lambda is always run from a random location on AWS’s network?
A lambda is never run “from within your VPC”, it’s attached via an ENI (or at least it was).
Yeah, this kind of thing is part of what I meant when I criticised AWS encouraging VPC use instead of end-to-end security.
But off the top of my head, you could always use the firewall API from the lambda to open network access between it and the RDS when the lambda starts. (In addition to using certs or IAM security on your TLS connection to the RDS db)
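Something like the following Python sketch, which I have not run against a real account. `ec2_client` and `rds_client` are assumed to be `boto3.client("ec2")` and `boto3.client("rds")`. How the lambda learns its own outbound address (`caller_ip`) is left open, and the function names are made up for illustration.

```python
from contextlib import contextmanager


@contextmanager
def temporary_db_access(ec2_client, db_group_id, caller_ip, port=5432):
    """Open the DB security group to one address, and close it afterwards."""
    rule = {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": f"{caller_ip}/32"}],  # single host only
    }
    ec2_client.authorize_security_group_ingress(
        GroupId=db_group_id, IpPermissions=[rule]
    )
    try:
        yield rule
    finally:
        # Always remove the rule, even if the database work raised.
        ec2_client.revoke_security_group_ingress(
            GroupId=db_group_id, IpPermissions=[rule]
        )


def iam_db_token(rds_client, host, user, port=5432):
    """Get a short-lived IAM auth token to use as the DB password over TLS."""
    return rds_client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user
    )
```

You would wrap the connect-and-query part of the handler in `with temporary_db_access(...)`. Every invocation then pays for two extra API calls, and many concurrent lambdas would be editing the same group, so treat it as a thought experiment.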
VPCs are very useful when running things like ElastiCache though (memcache and redis), because AFAIK those don't have an authentication ecosystem, so making them public would be a terrible idea.
Memcache has had reliable authentication (SASL) for some time. Redis has authentication meant to be a secondary protection.
But that's a good point.
I suppose all the services I use already have security models (usually more complex, multi-user ones, so agent X can read but not modify, etc.).
HOWEVER...this could be solved with security groups, but it seems that's not the model AWS has emphasized. Security groups are orthogonal to NAT and private networks; AWS had security groups before it had VPCs.
Firebase was made specifically for the cloud, RDS is the cloud atop postgres. I don't know how secure RDS is (against the myriad attacks) but it wouldn't be a bad idea to use the built-in AWS firewall to at least restrict access to trusted IPs ;)
Also, VPCs are really useful if you have many systems and services(yours or theirs) inside AWS.
Exposing a database to the public internet is a terrible idea. Yes, it's behind an auth layer, but is a username and password really enough protection for literally all of your company's data? Heck most people here have probably set up 2FA for their social media profiles, and for good reason.
Not sure about Firebase, but DynamoDB can be behind your VPC. From what I know about Firebase, it's meant to be a backend for mobile apps, so I guess it makes sense for it to be public.
> AWS does tons of stuff around VPCs....I feel like they really want me to use them (or their customers really want to use them)
VPC is a very convenient fit for enterprise customers extending on-premises networks into the cloud, I think that's the market it's mainly focussed on.
> I know it's "public", but that seems irrelevant in the era of cloud services.
It's not irrelevant, but neither is it necessarily critical all the time; there doesn't need to be a one-size- (or even one-shape-)fits-all universal approach to network security, and AWS encompasses a lot of different customer setups, including enterprises for which it is a virtual extension of the on-premises internal network.
Say you have a bunch of EC2 instances with public IP addresses that run an application that makes calls to a 3rd party service. Say that 3rd party service only allows access from certain IP ranges. Would you rather give them a single IP or hundreds of IPs to whitelist?
What you say may be acceptable for a small infra but not in a large setup.
You need to realize that the point of AWS is lock in. Once your service becomes a ball of various AWS pieces, it becomes almost impossible to leave once you start scaling.
So there is always a priority towards things that cause more lock in like VPC.
I don’t think AWS want you to use VPC at all. The Golden Path for serverless on AWS has always been “networkless”. If your use case fits into their stateless HTTP stack (API Gateway + Lambda + Dynamo + SQS...) then you’re gonna have a really easy time. The reason VPC is required is because not every use case is going to fit into that stack, and the fact that VPC functionality seems to be always just a little bit not good enough (in comparison) doesn’t make me think they’re pushing people towards it.
[1] https://serverless-stack.com