by t0mas88 4479 days ago
The "one million" number grabs attention, but in my opinion this isn't that special. Doing 3333 writes/sec per node is not hard with Cassandra; it can actually be much faster if you use fast local storage and, for example, put the commit log on a separate disk from the data. The Google article reads as if they used network storage and one volume per node, both of which are bad ideas for Cassandra, as documented by DataStax.

Keep in mind, it's not "One Million Writes Per Second," it's "One Million Writes Per Second on Google Compute Engine" with "Google Compute Engine" being the key point to the article.

The "one million writes per second" for Cassandra has been written about before (in this case, on AWS): http://techblog.netflix.com/2011/11/benchmarking-cassandra-s...

It is worth noting that GCE is more expensive now than AWS was back in 2011.

According to the Netflix article, the AWS experiment ran at a cost of $561 per 2h, that is ~$280 per hour. Perhaps they did not fully utilize the cluster during those 2h, in which case we should instead double the 1h test that performed 500k inserts per second; the cost would then be $182*2 = ~$365 per hour.

The GCE test ran at a cost of $330 per hour. Give or take a few dollars' difference, if anything it's surprising that GCE does at roughly the same cost what AWS was capable of 2+ years ago.
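For anyone who wants to redo the comparison, here is a rough sketch of the arithmetic above. It uses only the dollar figures quoted in this thread; the variable names are made up for illustration, and doubling the 500k/sec test assumes cost scales linearly with throughput.

```python
# Dollar figures as quoted in the thread; names are hypothetical.
NETFLIX_2H_TOTAL_USD = 561.0   # AWS experiment, billed over 2 hours
NETFLIX_1H_500K_USD = 182.0    # AWS 1-hour test at 500k inserts/sec
GCE_PUBLISHED_USD_PER_HOUR = 330.0

# Reading 1: spread the 2-hour bill evenly over both hours.
aws_even_split = NETFLIX_2H_TOTAL_USD / 2

# Reading 2: scale the 500k/sec hour up to 1M/sec.
# Assumption: cost grows linearly with throughput.
aws_scaled = NETFLIX_1H_500K_USD * 2

for label, usd in [
    ("AWS, 2h bill split evenly", aws_even_split),
    ("AWS, 500k/sec test doubled", aws_scaled),
    ("GCE, published figure", GCE_PUBLISHED_USD_PER_HOUR),
]:
    print(f"{label}: ${usd:.2f}/hour")
```

The GCE figure lands between the two AWS readings, which is the "roughly the same cost" point.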

All that said, the GCE guys made a great effort. I wonder, though, how much speed you can squeeze out of AWS, and at what cost, now that AWS offers SSDs.

Hi, the cost we published includes the time to set up the whole cluster, warm up the data nodes, and run for 5 minutes at 1M writes per second.

Our run rate is $281 per hour, which is the same as AWS a couple of years back. What changed is that we are using quorum commit, the data is encrypted at rest, we have very low tail latency, and we look at all samples when computing that.

Computing our price is easier because we do not charge per access.

Here is the formula for our run rate:

30 loaders (n1-highcpu-8) at $0.522 per hour: 15.66

300 nodes (n1-standard-8) at $0.829 per hour: 248.7

300 1TB PDs that run at $0.055555556 per hour: 16.67

Total: 281.03
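The run rate can be re-derived by multiplying each count by its hourly price and summing. Below is a minimal sketch using only the counts and prices listed above; the `LineItem` type and its field names are hypothetical, made up for this example. Note that 30 x 0.522 works out to 15.66 and 300 x 0.829 to 248.7.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    """One billed resource type (hypothetical helper for this sketch)."""
    name: str
    count: int
    usd_per_hour: float

    def cost(self) -> float:
        return self.count * self.usd_per_hour


# Counts and per-hour prices as listed in the comment above.
items = [
    LineItem("loaders (n1-highcpu-8)", 30, 0.522),
    LineItem("nodes (n1-standard-8)", 300, 0.829),
    LineItem("1TB PDs", 300, 0.055555556),
]

for item in items:
    print(f"{item.count} {item.name}: {item.cost():.2f}")

run_rate = sum(item.cost() for item in items)
print(f"Total: {run_rate:.2f}")
```

Since there is no per-access charge, the whole bill is just these three products.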

But keep an eye on us. These are today's prices.

That post doesn't mention anything about tail latency, while the GCE thing does point to P95 latency < 100ms consistently, which is nice.
I wrote the test - Yep. Tail latency is one of the key things here. And I took 100% of all samples, as opposed to the middle 80% the tool usually reports.
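To show why taking 100% of samples matters, here is a small sketch with synthetic latencies (not data from the test). It computes a nearest-rank percentile over all samples and over the middle 80%. The function names and the sample distribution are assumptions for illustration, not what the benchmark tool actually does internally.

```python
def percentile(samples, p):
    """Nearest-rank percentile; p is an integer in 1..100."""
    ordered = sorted(samples)
    rank = (p * len(ordered) + 99) // 100  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def middle_80(samples):
    """Drop the lowest 10% and highest 10% of samples."""
    ordered = sorted(samples)
    cut = len(ordered) // 10
    return ordered[cut:len(ordered) - cut]


# Synthetic latencies in ms: mostly fast, with a slow tail.
latencies_ms = [10] * 90 + [50] * 5 + [400] * 5

p95_all = percentile(latencies_ms, 95)
p99_all = percentile(latencies_ms, 99)
p99_trimmed = percentile(middle_80(latencies_ms), 99)

print("P95 over all samples:", p95_all)
print("P99 over all samples:", p99_all)
print("P99 over middle 80%:", p99_trimmed)
```

With this made-up distribution the trimmed view hides the slow tail entirely, which is why a tail latency claim only means something when every sample is counted.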
What was the network utilization during the test? If these machines were lightly loaded (< 30% utilized) then the tail latency isn't surprising. :)
Average network utilization was low by design. Keeping it steady was more important than keeping it low, though, and harder too.

Latency spikes come from Cassandra flushing data to disk (large sequential IO), Java garbage collection and heap resize, and page faults during compactions (random reads).

What I did to even traffic out was to enable trickle_fsync and size the flushes, set Java's max and min heap sizes, as well as to tune the Java heap ergonomics. I treated random reads as a fact of life - I did nothing to tune that.

Doesn't GCE run on the same (physical, not logical) network as the rest of Google's production systems? If so, which I believe is the case, how can you control for network utilization?