by felipe_aramburu
2673 days ago
OK. Right now we are in tunnel-vision mode to get our distributed version out by GTC in mid-March. We will benchmark against ClickHouse sometime in March. Do you know of any benchmark tests that are a bit more involved in terms of query complexity? We are most interested in queries where you can't be clever and use things like indexing and precomputed materializations. The more complex the query, the less you can rely on being clever and the more the guts need to be performant, and that is what matters most to us right now.
We use the DTC airline on-time performance dataset (https://www.transtats.bts.gov/tables.asp?DB_ID=120) and Yellow Taxi trip data from NYC Open Data (https://data.cityofnewyork.us/browse?q=yellow%20taxi%20data&...) for benchmarking real-time query performance on ClickHouse. I'm working on publishing both datasets in a form that makes them quick and easy to load. Queries are an exercise for the reader, but see Mark Litwintschik's blog for good examples: https://tech.marksblogg.com/billion-nyc-taxi-clickhouse.html.
We've also done head-to-head comparisons on time series using the TSBS benchmark developed by the Timescale team. See https://www.altinity.com/blog/clickhouse-timeseries-scalabil... for a description of our tests as well as a link to the TSBS GitHub project.
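For a rough idea of the kind of query that stresses the engine rather than clever indexing, here is a minimal sketch of a multi-column GROUP BY timed over a full table scan. Everything in it is an assumption for illustration: the `trips` table, its column names, the synthetic rows, and the `time_query` helper are made up, and SQLite stands in for the real database only so the snippet runs anywhere. It is not one of the queries from the blog or from our tests.

```python
import sqlite3
import time


def time_query(conn, sql, repeats=3):
    """Run sql `repeats` times; return (best wall-clock seconds, rows of last run)."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return best, rows


# Tiny synthetic stand-in for a taxi trips table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips ("
    "pickup_year INTEGER, passenger_count INTEGER, trip_distance REAL)"
)
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [(2014 + i % 3, 1 + i % 4, (i % 50) / 2.0) for i in range(100_000)],
)

# Multi-column GROUP BY with a computed key: no index is defined,
# so every row has to be read and aggregated.
QUERY = """
SELECT passenger_count,
       pickup_year,
       CAST(round(trip_distance) AS INTEGER) AS distance,
       count(*) AS n
FROM trips
GROUP BY passenger_count, pickup_year, distance
ORDER BY pickup_year, n DESC
"""

elapsed, rows = time_query(conn, QUERY)
print(f"{len(rows)} groups, best of 3 runs: {elapsed:.4f}s")
```

To compare engines you would swap the connection and the SQL dialect, keep the same query shape, and load the real dataset instead of the synthetic rows.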