Hacker News new | ask | show | jobs
by sergei 4570 days ago
Several problems here:

1. Unlike your dataset, the tpcc dataset for the benchmark was not memory resident. Total dataset size was 786G. Just shrinking the dataset to fit in memory would substantially change the numbers.

2. The tpcc workload is much more complex than your benchmark. It doesn't make sense to compare a tpcc transaction, of which there are multiple flavors, to your workload.

3. All Clustrix write transactions are multi-node transactions. There's a 2x data redundancy in the test. We do not have a master-slave model for data distribution. Docs on our data distribution model: http://docs.clustrix.com/display/CLXDOC/Data+Distribution.

-----

Now we've also done tests where the workload is a simple point updates and select on memory resident datasets. For those kinds of workloads, a 20 node cluster can do over 1M TPS.