| HN Mirror



	by orhanhh 2191 days ago
	I recently benchmarked TiDB (pre-4.0), CockroachDB and YugabyteDB. TiDB outperformed the others on write throughput and latency, probably because it has somewhat weaker isolation guarantees (snapshot vs. serializable). However, for all read operations, CockroachDB performed significantly better than TiDB. It would be interesting to do a new comparison with version 4.0, as it seems from this article that they have improved performance quite a bit.

2 comments

sanxiyn 2191 days ago

I mean, CockroachDB does not target OLAP use cases, but TiDB does. This is a fundamental design choice with tradeoffs, so I think some performance gap for OLTP use cases is expected.


nujabe 2191 days ago

How did you benchmark?


orhanhh 2191 days ago

I used oltpbenchmark and ran automated tests on Hetzner Cloud using Terraform and some automation scripts. The comparisons are based on the YCSB workloads executed by oltpbenchmark.


youjiali1995 2189 days ago

Hi, I want to reproduce it. Could you tell me what scale factor you used and whether you changed the default weights?


orhanhh 2184 days ago

I used 100 as the scale factor and used different weights for benchmarks A through F, defined by the original YCSB project here: https://github.com/brianfrankcooper/YCSB/tree/master/workloa...
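
For readers trying to reproduce this, the operation mixes of the six YCSB core workloads can be written down as a small table. This is only a sketch based on the YCSB core workload definitions; the names `YCSB_MIXES` and `write_share` and the operation labels are illustrative assumptions, not oltpbenchmark's configuration format.

```python
# Operation mix (in percent) of the YCSB core workloads A through F.
# The dictionary layout and operation labels are illustrative only.
YCSB_MIXES = {
    "A": {"read": 50, "update": 50},             # update heavy
    "B": {"read": 95, "update": 5},              # read mostly
    "C": {"read": 100},                          # read only
    "D": {"read": 95, "insert": 5},              # read latest
    "E": {"scan": 95, "insert": 5},              # short ranges
    "F": {"read": 50, "read_modify_write": 50},  # read-modify-write
}


def write_share(workload: str) -> int:
    """Return the percentage of operations in a workload that modify data."""
    mix = YCSB_MIXES[workload]
    return sum(pct for op, pct in mix.items() if op not in ("read", "scan"))


if __name__ == "__main__":
    for name in sorted(YCSB_MIXES):
        print(name, YCSB_MIXES[name], "write share:", write_share(name))
```

Workload C has no writes at all, so it is probably the closest match to the "read-only" case discussed in this thread.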


shenli3514 2184 days ago

May I know how many instances there were in the CRDB/TiDB clusters and what concurrency the benchmark used? We found that for small clusters (for example, 3 instances), CRDB can be fast in read-only workloads. Because CRDB is a single binary, about 1/3 of read operations do not involve an RPC. TiDB needs an RPC for every request. For larger clusters or higher concurrency, it is a different story.
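
The "about 1/3" figure can be sketched with simple arithmetic. This assumes that the node able to serve a read is spread uniformly across the cluster and that clients connect to nodes uniformly at random; both are simplifying assumptions, and `local_read_fraction` is a hypothetical helper, not part of either database.

```python
def local_read_fraction(num_nodes: int) -> float:
    """Expected fraction of reads served by the node the client is connected to.

    Assumes the serving replica for each row is placed uniformly at random
    across nodes and that clients pick a gateway node uniformly at random.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return 1.0 / num_nodes


if __name__ == "__main__":
    # Cluster sizes in the range mentioned in this thread.
    for n in (3, 6, 12):
        print(f"{n} nodes: {local_read_fraction(n):.1%} of reads avoid an RPC")
```

Under these assumptions the share of RPC-free reads shrinks as the cluster grows, which is consistent with the single-binary advantage mattering most on a 3-node cluster.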


orhanhh 2183 days ago

I tested clusters with between 3 and 12 nodes, and the differences were similar across the different sizes. I’m not sure how it performs for clusters larger than that, though. Additionally, the results might have been a bit misleading on the large clusters because of the low scale factor, which leads to higher contention on some rows.
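
A rough way to see the contention effect is to estimate how often concurrent operations land on the same row. The sketch below assumes that each scale-factor unit corresponds to 1,000 rows (an assumption about the benchmark's YCSB loader, so check it against your version) and that rows are picked uniformly. A skewed key distribution would collide more often, so this is a lower bound. Both function names are hypothetical.

```python
def rows_for_scale_factor(scale_factor: int, rows_per_unit: int = 1000) -> int:
    """Table size for a given scale factor (rows_per_unit is an assumption)."""
    return scale_factor * rows_per_unit


def collision_probability(concurrent_ops: int, num_rows: int) -> float:
    """Probability that at least two of the concurrent operations hit the
    same row, assuming each picks a row uniformly at random (birthday problem).
    """
    if concurrent_ops > num_rows:
        return 1.0  # more operations than rows: a collision is certain
    p_all_distinct = 1.0
    for i in range(concurrent_ops):
        p_all_distinct *= (num_rows - i) / num_rows
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    rows = rows_for_scale_factor(100)
    for clients in (16, 64, 256):
        p = collision_probability(clients, rows)
        print(f"{clients} concurrent ops on {rows} rows: P(collision) = {p:.3f}")
```

The collision probability grows roughly with the square of the concurrency, so a table size that is comfortable for a small cluster can become a hot spot once more nodes and clients are added.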
