| HN Mirror


	by snikch 3452 days ago
	Amazon's Aurora databases seem to be solving the same problem, and are MySQL or Postgres compatible to boot.

5 comments

tedd4u 3452 days ago

Aurora is very cool but won't help you much after you vertically scale your master and still need more write capacity. With Cloud Spanner you get horizontal write scalability out of the box. Critical difference.

abalone 3452 days ago

So if I'm understanding you, with Aurora all writes go to one master and you're constrained by the biggest instance AWS offers. Is that right?

Do you have a sense of what that limit is?

There's a pretty big price difference between Spanner and Aurora at the entry level so it's useful to explore this.

awj 3452 days ago

> Do you have a sense of what that limit is?

Per their pricing page[1] it looks like the largest instance available is a "db.r3.8xlarge", which is the RDS name for the "r3.8xlarge" instance type[2], with 32 vCPUs and 244 GiB of memory.

That's a hell of a lot of capacity to exhaust, especially if you're using read replicas to reduce it to only/mostly write workloads. Obviously it's possible to use more than this, but the "sheer scale" argument is a bit of a flat one.

[1] https://aws.amazon.com/rds/aurora/pricing/ [2] https://aws.amazon.com/ec2/instance-types/#r3

euyyn 3452 days ago

Wouldn't the write master be I/O-bound, rather than CPU- or memory-bound?

computerex 3451 days ago

I disagree, the "sheer scale" argument is not flat. The fact that one can scale horizontally and the other can't is significant.

Let me present a quote to you: "640K ought to be enough for anybody."

awj 3451 days ago

You can disagree on that if you'd like, but note that I explicitly acknowledged the possibility of exceeding these limits. In my opinion, for most cases/workloads, it's highly unlikely that you will and designing for that from the outset is a waste of time and resources.

edaemon 3452 days ago

Yes, Aurora has a single write master, though it does have automatic write failover -- i.e. if the Aurora primary dies, one of your read replicas is promoted to the primary and reads/writes are directed to the new instance. That does constrain your primary's capabilities to the largest instance size (currently a db.r3.8xlarge).

I don't have a good idea what the upper limit is for an Aurora database setup.

macintux 3452 days ago

How does Aurora know that the primary is dead? Automatic failover is problematic in a distributed system.

CaveTech 3452 days ago

AWS uses heartbeats for detecting liveness. If x heartbeats fail, the failover procedure is started. Generally that takes 10 seconds to 5 minutes. In practice (for me) the failover has taken less than 15 seconds.
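
For readers who want the heartbeat idea spelled out, here is a minimal sketch of a consecutive-miss detector. It is an illustration only, not how AWS implements it: the class name `HeartbeatMonitor`, the `max_missed` threshold (the "x" in the comment above) and its value of 3 are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Counts consecutive missed heartbeats. Hypothetical, not an AWS API."""

    max_missed: int = 3  # assumed threshold; the real value is not given in the thread
    missed: int = 0
    failed_over: bool = False

    def record(self, heartbeat_ok: bool) -> bool:
        """Record one heartbeat result; return True when failover should start."""
        if self.failed_over:
            return False  # failover was already triggered once
        if heartbeat_ok:
            self.missed = 0  # any successful heartbeat resets the counter
            return False
        self.missed += 1
        if self.missed >= self.max_missed:
            self.failed_over = True
            return True
        return False


monitor = HeartbeatMonitor(max_missed=3)
# Two misses, a recovery, then three misses in a row.
results = [monitor.record(ok) for ok in [True, False, False, True, False, False, False]]
print(results)
```

Only the final heartbeat triggers failover, because the earlier misses were interrupted by a success. A detector like this cannot tell a dead primary from an unreachable one.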

macintux 3452 days ago

My concern was more around split brain. If you fail over while the write master is simply unreachable, pain results.
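
One common guard against split brain is to promote a replica only when a strict majority of independent observers agree that the primary is unreachable. The sketch below illustrates that general technique; nothing in this thread says Aurora works this way, and the function name `should_promote` and its parameters are hypothetical.

```python
def should_promote(unreachable_votes: int, total_observers: int) -> bool:
    """Return True only if a strict majority of observers cannot reach the primary.

    Illustrative majority-quorum rule; not taken from any AWS documentation.
    """
    if total_observers <= 0:
        raise ValueError("total_observers must be positive")
    if not 0 <= unreachable_votes <= total_observers:
        raise ValueError("unreachable_votes must be between 0 and total_observers")
    return unreachable_votes > total_observers // 2


# One observer out of three losing contact is not enough to fail over.
print(should_promote(1, 3))
# Two out of three is a strict majority.
print(should_promote(2, 3))
```

With an even number of observers a tie does not promote, so a network partition that splits the observers evenly leaves the existing primary in place.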

xapata 3452 days ago

Yeah, the latency on that failover isn't specified.

edaemon 3452 days ago

Do you mean the amount of time it takes to initiate a failover or the amount of time for a failover to complete?

For the former, I don't think they specify beyond "automatic".

For the latter, "service is typically restored in less than 120 seconds, and often less than 60 seconds": http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora...

ChuckMcM 3452 days ago

Amazon provides a testing methodology here: https://d0.awsstatic.com/product-marketing/Aurora/RDS_Aurora... which might be useful to explore when benchmarking the two services against each other.

manacit 3452 days ago

Aurora is a 'better MySQL mousetrap', IMO.

This is a globally-available, nearly-CAP-beating datastore that powers one of the biggest websites on the internet.

It's not quite apples and oranges, but this is definitely a different problem they are solving.

abalone 3452 days ago

That's vague. AWS also powers huge websites and Amazon is recommending Aurora as the "default choice" for most workloads.[1] There are certainly significant architectural differences but I would say we can definitely make a direct practical comparison.

[1] http://www.computerworld.com/article/2953299/cloud-computing...

cobookman 3452 days ago

If Aurora powers huge websites, Spanner is for ginormous websites. Think a multiple of Netflix's database needs.

dzhiurgis 3452 days ago

Curious to know what Netflix's needs are for a relational database.

Doesn't strike me as a business with complex logic.

threeseed 3452 days ago

Netflix mainly uses Cassandra as their database.

And their needs are reasonably complex. They use machine learning and big data analytics to generate the list of videos that you should be watching. For those to work, they need to capture a whole raft of end-user metrics, e.g. at what point you paused video X.

cobookman 3452 days ago

I'd assume they keep track of who watches what for their 'continue watching series...' pane.

Netflix was given as an example of scale. As another example, Spanner could be used to store every Visa transaction.

rattray 3452 days ago

While Aurora doesn't provide true horizontal scalability, the same-node scalability seems so strong it might allow many companies to stay single-node for quite a while.

For example, see this benchmark:

http://2ndwatch.com/wp-content/uploads/2016/09/Graph-3.jpg

from this article:

http://2ndwatch.com/blog/benchmarking-amazon-aurora/

Thoughts?

xapata 3452 days ago

Aurora's other zone replicas are read-only. Probably no atomic clocks and GPS for time synchronization.

To be fair, Spanner's cross-region service is coming "later 2017".

johnsmith21006 3452 days ago

It is not close to equivalent. But I do want to get a better feel for whether Google really has figured out how to do basically the impossible. I want to see if this truly scales horizontally, but if it does then competitors had better hope for a much more detailed paper :)

jack9 3451 days ago

> It is not close to equivalent.

It's equivalent, with different (unknown) constraints. Aurora is specifically for scaling workloads in the same way. You can say it's horizontal (machine) over vertical (resource) but it's all a matter of accounting.

The big no-no is the Spanner price point. I will stick with Aurora for scaling based on the traffic I use, over pricey timeslices.

You would have to have quite a load to justify the switch from cheaper du jour solutions right now (AWS). Relying on the few that do is a risk.
