by abarrak 1808 days ago
A couple of days ago, I tried to search for how AWS operates RDS behind the scenes. Since it is a managed stateful service, I was wondering whether it runs in a traditional VM-based way or in a fully containerized environment. Unfortunately, a simple search only leads you to the consumer/customer-facing resources out there.
2 comments

This is a good paper that talks about Aurora and provides some insight into how RDS operates: https://www.allthingsdistributed.com/files/p1041-verbitski.p...

It’s nice that AWS builds their own higher level abstractions on the same primitives outside developers use. Feels like they eat their own dogfood much more than Google where they bypass GCP and instead utilize underlying Borg primitives for many services.

Which GCP services run directly on Borg? My understanding is that at least Bigtable, Cloud SQL and other DBs are within “hidden” VMs. I think load balancers and storage are exceptions, but the same is true for AWS (except the classic ELB, probably).
Bigtable, Firestore and Spanner run directly on Borg.

Cloud SQL V2 runs in hidden VMs.

Are those not internal services that pre-date GCP that are exposed externally _through_ GCP?
An AWS engineer talks about this here: https://twitter.com/rakyll/status/1415170934609121285?s=20
Another thing AWS has going in its favor is that the rest of Amazon uses AWS, whereas at Google none of the flagship services run on GCP (though I think YouTube is moving, which is good).
The rest of Amazon doesn’t run on AWS in its entirety, afaik. Also not sure it’s necessarily a good thing either.
This is a new thing really - it used to be that you'd use a different system that's in many ways better integrated with how the rest of development works but far worse in terms of UX, capacity planning, etc. Now many of the tools are basically a Frankenstein transformation from the old way into the Amazon-specific way AWS is used via the multi-account pattern.
Nice. I wonder if the stateful-workload support provided and marketed by container orchestrators (e.g. K8s) is something they will consider in the future?
To build a new service at Amazon, the general path of least resistance these days is to use Lambda. If not Lambda, then ECS. If not ECS (or if it requires bare metal) then EC2.
Yeah, except for the Lambda time limit...
Given most things are fronted by API Gateway anyway, this isn’t too big of a concern since there’s already an effective 30 second timeout there. For ETL-like stuff you can use Step Functions too.
It's still sad, because then you need a lambda for just these one or two things that might run long (or maybe even forever) and now you have to mix lambda and e.g. ECS.
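To make the Lambda -> ECS -> EC2 ordering and the time-limit objection concrete, here is a rough sketch of that decision in Python. The `Workload` type and `pick_compute` function are made up for illustration, and the 900 second figure is Lambda's documented maximum function timeout, which nobody in this thread actually quoted. It is just one reading of the "path of least resistance" described above, not an Amazon-internal rule.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: Lambda's maximum function timeout (15 minutes); not stated in the thread.
LAMBDA_MAX_SECONDS = 900


@dataclass
class Workload:
    """Hypothetical description of a service component."""
    max_runtime_seconds: Optional[float]  # None means it may run forever
    needs_bare_metal: bool = False


def pick_compute(w: Workload) -> str:
    """Walk the preference order from the thread: Lambda, then ECS, then EC2."""
    if w.needs_bare_metal:
        return "EC2"
    if w.max_runtime_seconds is not None and w.max_runtime_seconds <= LAMBDA_MAX_SECONDS:
        return "Lambda"
    # Long-running or unbounded work falls through to containers.
    return "ECS"


# A service with one short handler and one long-running worker ends up mixed,
# which is the annoyance raised in the comment above.
handlers = {
    "api": Workload(max_runtime_seconds=5),
    "worker": Workload(max_runtime_seconds=None),
}
choices = {name: pick_compute(w) for name, w in handlers.items()}
```

With these inputs the short handler lands on Lambda and the unbounded worker on ECS, so a single service spans two compute platforms.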
Based on how they bill it, it looks like it's running on VMs.
Agreed. AWS RDS instance types are just EC2 instance types prefixed with "db." and you're choosing either single-AZ or multi-AZ deployments so presumably AWS is just spinning up 1 to 3 EC2 instances with some preconfigured software on them.
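The naming observation is easy to show in a few lines of Python. The helper name `ec2_type_for` is made up, and the mapping is only the string correspondence described above; it says nothing definitive about what actually runs underneath RDS.

```python
def ec2_type_for(db_instance_class: str) -> str:
    """Strip the "db." prefix from an RDS instance class name.

    Assumption: the remainder matches an EC2 instance type name,
    as the comment above suggests. This is a naming convention only.
    """
    prefix = "db."
    if not db_instance_class.startswith(prefix):
        raise ValueError(f"not an RDS instance class: {db_instance_class!r}")
    return db_instance_class[len(prefix):]


example = ec2_type_for("db.m5.large")
```

Here `example` is the string `m5.large`, the same name you would pick when launching a plain EC2 instance.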
From what I know there is a secret sauce beyond a mere AMI and a control plane, based on some EBS volumes magic. I may be mixing things up with Aurora though.
This is very, very safe to assume. There are probably a hundred engineers' worth of "secret sauce" for an entire managed DB product line.
There were some comments in the early days that the Multi-AZ magic for classic RDS was just DRBD on top of EBS.

Aurora is a completely different approach where the RDBMS code is modified to directly interface with EBS instead of going through a traditional OS filesystem layer.

Amazon published a paper describing how Aurora works:

https://www.allthingsdistributed.com/files/p1041-verbitski.p...