by snikch 3405 days ago
Amazon's Aurora databases seem to be solving the same problem, and are MySQL or Postgres compatible to boot.
5 comments

Aurora is very cool but won't help you much after you vertically scale your master and still need more write capacity. With Cloud Spanner you get horizontal write scalability out of the box. Critical difference.
So if I'm understanding you, with Aurora all writes go to one master and you're constrained by the biggest instance AWS offers. Is that right?

Do you have a sense of what that limit is?

There's a pretty big price difference between Spanner and Aurora at the entry level so it's useful to explore this.

> Do you have a sense of what that limit is?

Per their pricing page[1] it looks like the largest instance available is a "db.r3.8xlarge", which is the RDS name for the "r3.8xlarge" instance type[2], with 32 vCPUs and 244 GiB of memory.

That's a hell of a lot of capacity to exhaust, especially if you're using read replicas to reduce it to only/mostly write workloads. Obviously it's possible to use more than this, but the "sheer scale" argument is a bit of a flat one.

[1] https://aws.amazon.com/rds/aurora/pricing/ [2] https://aws.amazon.com/ec2/instance-types/#r3

Wouldn't the write master be I/O-bound, rather than CPU- or memory-bound?
I disagree; the "sheer scale" argument is not flat. The fact that one can scale horizontally and the other can't is significant.

Let me present a quote to you: 512 kb ram ought to be enough for everybody

You can disagree on that if you'd like, but note that I explicitly acknowledged the possibility of exceeding these limits. In my opinion, for most cases/workloads, it's highly unlikely that you will and designing for that from the outset is a waste of time and resources.
Yes, Aurora has a single write master, though it does have automatic write failover -- i.e. if the Aurora primary dies, one of your read replicas is promoted to the primary and reads/writes are directed to the new instance. That does constrain your primary's capabilities to the largest instance size (currently a db.r3.8xlarge).

I don't have a good idea what the upper limit is for an Aurora database setup.

How does Aurora know that the primary is dead? Automatic failover is problematic in a distributed system.
AWS uses heartbeats for detecting liveness. If x heartbeats fail, the failover procedure is started. Detection generally takes 10 seconds to 5 minutes. In practice (for me) the failover has taken less than 15 s.
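
For anyone who wants the heartbeat idea spelled out, here is a minimal sketch of a "fail over after x missed heartbeats" counter. This is not how AWS implements it; `HeartbeatMonitor`, `interval_s`, and `max_missed` are made-up names and the values are purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Hypothetical monitor: flags failover after consecutive missed heartbeats."""
    interval_s: float = 5.0  # assumed heartbeat period, not an AWS value
    max_missed: int = 3      # the "x" from the comment above; illustrative only
    missed: int = 0

    def record(self, heartbeat_ok: bool) -> bool:
        """Feed one heartbeat result; return True when failover should start."""
        if heartbeat_ok:
            self.missed = 0  # any successful heartbeat resets the counter
        else:
            self.missed += 1
        return self.missed >= self.max_missed

    def detection_time_s(self) -> float:
        """Rough time to notice a dead primary, ignoring network timeouts."""
        return self.interval_s * self.max_missed


monitor = HeartbeatMonitor()
results = [monitor.record(ok) for ok in
           [True, False, False, True, False, False, False]]
```

With these assumed values, only the third consecutive miss trips the failover, so detection alone takes roughly `interval_s * max_missed` seconds. Note that a counter like this cannot tell a crashed primary from one that is merely unreachable.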
My concern was more around split brain. If you fail over while the write master is simply unreachable, pain results.
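
To make the split-brain worry concrete: one generic mitigation (I am not claiming Aurora does this) is fencing, where the storage layer rejects writes from any primary holding an older epoch than the newest one it has seen. A rough sketch, with `FencedStore` and the epoch numbers being hypothetical:

```python
class FencedStore:
    """Hypothetical storage layer that rejects writes from stale primaries."""

    def __init__(self) -> None:
        self.highest_epoch = 0  # newest primary epoch seen so far
        self.data: dict[str, str] = {}

    def write(self, epoch: int, key: str, value: str) -> bool:
        """Accept the write only if the writer's epoch is not stale."""
        if epoch < self.highest_epoch:
            return False  # an old primary that missed the failover is fenced off
        self.highest_epoch = epoch
        self.data[key] = value
        return True


store = FencedStore()
first = store.write(1, "k", "old primary")   # original primary, epoch 1
second = store.write(2, "k", "new primary")  # promoted replica, epoch 2
stale = store.write(1, "k", "stale write")   # old primary reappears
```

Here the old primary's late write is refused, so the promoted replica's value survives. The hard part in practice is handing out epochs safely, which this sketch assumes away.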
Yeah, the latency on that failover isn't specified.
Do you mean the amount of time it takes to initiate a failover or the amount of time for a failover to complete?

For the former, I don't think they specify beyond "automatic".

For the latter, "service is typically restored in less than 120 seconds, and often less than 60 seconds": http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora...

Amazon provides a testing methodology here: https://d0.awsstatic.com/product-marketing/Aurora/RDS_Aurora... which might be useful to explore when benchmarking the two services against each other.
Aurora is a 'better MySQL mousetrap', IMO.

This is a globally-available, nearly-CAP-beating datastore that powers one of the biggest websites on the internet.

It's not quite apples and oranges, but this is definitely a different problem they are solving.

That's vague. AWS also powers huge websites and Amazon is recommending Aurora as the "default choice" for most workloads.[1] There are certainly significant architectural differences but I would say we can definitely make a direct practical comparison.

[1] http://www.computerworld.com/article/2953299/cloud-computing...

If Aurora powers huge websites, Spanner is for ginormous websites. Think a multiplier to Netflix's database needs.
Curious to know what Netflix's needs are for a relational database?

Doesn't strike me as a business with complex logic.

Netflix mainly uses Cassandra as their database.

And their needs are reasonably complex. They use machine learning and big data analytics to generate the list of videos that you should be watching. In order for those to work they need to capture a whole raft of end user metrics e.g. at what point you paused video X.

I'd assume they keep track of who watches what for their 'continue watching series...' pane.

Netflix was given as an example of scale. I guess for another example, Spanner could be used to store every Visa transaction.

While Aurora doesn't provide true horizontal scalability, the same-node scalability seems so strong it might allow many companies to stay single-node for quite a while.

For example, see this benchmark:

http://2ndwatch.com/wp-content/uploads/2016/09/Graph-3.jpg

from this article:

http://2ndwatch.com/blog/benchmarking-amazon-aurora/

Thoughts?

Aurora's other zone replicas are read-only. Probably no atomic clocks and GPS for time synchronization.

To be fair, Spanner's cross-region service is coming "later 2017".

It is not close to equivalent. But I do want to get a better feel for whether Google really has figured out how to do basically the impossible. I want to see if this truly scales horizontally, but if it does then competitors better hope for a much more detailed paper :)
> It is not close to equivalent.

It's equivalent, with different (unknown) constraints. Aurora is specifically for scaling workloads in the same way. You can say it's horizontal (machine) over vertical (resource) but it's all a matter of accounting.

The big no-no is the Spanner price point. I will stick with Aurora for scaling based on traffic I use, over pricey timeslices.

You would have to have quite a load to justify the switch from cheaper du jour solutions right now (AWS). Relying on the few that do is a risk.