What is the difference compared with the benchmark Altinity did of ClickHouse against TimescaleDB? ClickHouse performed better there on the same test. What gives?
Thank you. My only nit is the way the ratio (CH/TS) is shown. What is the purpose of that? It shows a bigger percentage for cases in which TS is better, but a smaller percentage for cases where CH gives better results. From a data representation perspective, I do not think that is fair.
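To make the asymmetry concrete, here is a minimal sketch. It assumes the metric is one where lower is better (for example query time), and the timings are made-up illustration values, not numbers from the post. The helper names `ratio_pct` and `log2_ratio` are hypothetical.

```python
import math


def ratio_pct(ch: float, ts: float) -> float:
    """CH/TS expressed as a percentage, the way the ratio column presents it."""
    return 100.0 * ch / ts


def log2_ratio(ch: float, ts: float) -> float:
    """Log-scaled ratio: equal-sized wins land at equal distance from 0."""
    return math.log2(ch / ts)


# Hypothetical timings in ms (lower is better), NOT taken from the benchmark.
ts_twice_as_fast = (200.0, 100.0)  # (ch, ts)
ch_twice_as_fast = (100.0, 200.0)

# Same 2x advantage, but the percentages sit at different distances from 100%.
print(ratio_pct(*ts_twice_as_fast))   # 200.0 -> 100 points above parity
print(ratio_pct(*ch_twice_as_fast))   # 50.0  -> only 50 points below parity

# On a log scale the two cases are mirror images.
print(log2_ratio(*ts_twice_as_fast))  # 1.0
print(log2_ratio(*ch_twice_as_fast))  # -1.0
```

A log scale (or always reporting "N times faster" for whichever side wins) is one way to present both directions evenly. It is only a suggestion, not how the post computes its numbers.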
The two big things, which we discuss at length in the post, are:
- Altinity (and others) did not enable compression in TimescaleDB. Compression converts data into columnar storage and improves queries over historical data, because individual columns can be retrieved in compressed format, similar to CH.
- They didn't explore different batch sizes, so there was no way to see how each database is affected as the batch size changes.
Have you, on your side, followed all ClickHouse best practices?
ClickHouse's design in particular suggests doing an ingest request approximately once per second. If you insert much more often than that, you are using it outside its intended usage, and if you need that, you usually put some sort of queue between whatever produces the data and ClickHouse.
Note that ingesting in small batches can also significantly affect query performance.
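For readers wondering what "some sort of queue" can look like, here is a minimal in-process sketch. It assumes a hypothetical `flush_fn` callback that performs the actual bulk insert; the `BatchBuffer` class and its thresholds are illustrative only and not part of ClickHouse or any client library.

```python
import time
from typing import Any, Callable, List, Optional


class BatchBuffer:
    """Collects rows and hands them to flush_fn in large batches.

    flush_fn is a hypothetical callback (for example a bulk INSERT);
    the thresholds are illustrative, not recommended values.
    """

    def __init__(
        self,
        flush_fn: Callable[[List[Any]], None],
        max_rows: int = 10_000,
        max_age_s: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush_fn = flush_fn
        self._max_rows = max_rows
        self._max_age_s = max_age_s
        self._clock = clock
        self._rows: List[Any] = []
        self._first_at: Optional[float] = None

    def add(self, row: Any) -> None:
        # Remember when the current batch started so its age can be checked.
        if not self._rows:
            self._first_at = self._clock()
        self._rows.append(row)
        too_big = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._first_at >= self._max_age_s
        # The age check only runs when a row arrives; a real service would
        # also flush from a timer so a quiet producer cannot strand rows.
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if self._rows:
            batch, self._rows = self._rows, []
            self._first_at = None
            self._flush_fn(batch)


sent: List[List[int]] = []
buf = BatchBuffer(sent.append, max_rows=3, max_age_s=3600.0)
for i in range(7):
    buf.add(i)
buf.flush()  # push the final partial batch
print(sent)
```

With `max_rows=3`, the seven rows go out as three inserts instead of seven. In production this role is usually played by an external queue or a buffering layer rather than an in-process list.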
Yep - it's all detailed in the post! The question is how it compares to TimescaleDB, which is an OLTP time-series database that has a lot of other possible use cases (and extensibility). I think it's very fair to explore how smaller batches work, since (as far as we can see) nobody else has actually shown that. That way, users who would normally be coming from a database like PostgreSQL can understand the impact small batches would have.
As for ingest queueing, TSBS does not queue results. We agree, and tell most users that they should queue and batch insert in larger numbers. Not every app is designed that way and so we wanted to understand what that would look like.
But CH did amazingly well regardless of that with batches above 1k-2k rows, and lived up to its name as a really fast database for ingest!
That post was written in November 2018 - 3 years ago - when TimescaleDB was barely 1.0.
A lot has changed since then:
1. TimescaleDB launched native columnar compression in 2019, which completely changed its story around storage footprint and query performance [0]
2. TimescaleDB has gotten much better
3. PostgreSQL has also gotten better (which in turn makes TimescaleDB better)
In fact, IIRC Altinity used TSBS and contributed ClickHouse support to it [1], and TSBS is also what this newer benchmark uses.
(Disclaimer: TimescaleDB co-founder)
[0] https://blog.timescale.com/blog/building-columnar-compressio...
[1] https://github.com/timescale/tsbs