by qaq
1705 days ago
and then it will come down to the spec of the nodes, the actual fields, and so on. Also, batch size obviously plays a big role here, as CH is optimized for very large batch sizes and the benchmark is not really using that kind of batch size. BTW, I am not involved with CH, but any vendor benchmarking their wares will always select params that make their offering look good.
|
Interestingly, we were just testing a multi-node TimescaleDB cluster the other day and found that 75k rows/batch was the optimal size as the number of nodes increased.
So you're completely correct. I tried to be very clear that we were not intentionally "cooking the books", and there are surely other optimizations we could have made. Most of the suggestions so far, however, require further setup of CH features that haven't been used in other benchmarks, so we tried to overcommunicate our strategy and process.
We also fully acknowledged in the post that a siloed "insert", wait, then "query" test is not real-world. But it's the way TSBS is currently used, and other DB engines have come along and adopted the methodology for now. Maybe that process will change in time with other contributions.
BTW, we'll discuss some of this next week during the live-stream and the video will be available after.