by cachemiss 1701 days ago
Having used both TSDB and ClickHouse in anger I have some thoughts on this:

They are both fantastic engines. I really like that both have made very specific tradeoffs and are very clear about what they are good and bad at. Having worked on database engines, I can appreciate the complexity of the problems they are solving.

My most recent use is with ClickHouse, which is great and I think a complete game-changer for the company. However, there are a lot of issues (that are being worked on; the core team is great, though there are a few personalities that are a bit frosty to deal with). All of these comments come with love for the system.

1. Joins really need some work, both in the kinds of algorithms (PK-aware joins, merge joins that don't do a full sort, etc.) and in query optimizer work to make them better. We have analysts that use our system, and telling them to constantly write subqueries for simple joins is a total PITA. Not having PK-aware joins is a massive blocker for higher utilization at our company, which really loves CH otherwise.

2. Some personalities will tell you that not having a query optimizer is a feature, and from an operational standpoint it is nice to know that a query plan won't change and that you won't have to try and force the optimizer to do the right thing. However, given #1, making joins performant (we have one huge table with trillions of rows, and a few smaller ones with billions) is really rough.

3. The operations story really needs some work, especially the distribution model. Personally, I find the model of local tables with a distributed table over them difficult to work with. It would be nice to just be able to plug servers in without a lot of work, like Scylla, and not have two tables whose schemas you have to keep consistent. There's also just some odd behavior: if you insert async into a distributed table and only have a few shards, it'll only use a thread per shard to move that data over. It would be nice if there wasn't as much to think about.

4. Following #3, there are just too many knobs. Maybe a tuning tool or something would help, but configuring thread pools is difficult to get right. I suspect CH could use a dedicated scheduler like Scylla's that could dispatch the work, instead of relying on the OS.

5. The storage system relies a lot on the underlying FS and settings on when to fsync etc. I suspect if they had a more dedicated storage engine (controlled by the scheduler above), things could be more reliable. I still don't fully trust data being safe with CH.

6. Deduplication - This is a hard problem, but one that is really difficult to solve in CH. We solve it by having our inserters coordinate so that they always produce identical blocks, using replacing merge trees to catch stragglers (maybe), but it isn't perfect. A suggestion if possible is to try and put the same keys into the same parts, so they'll always get merged out by the replacing merge tree (I understand this is difficult).
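
To make #6 a bit more concrete, here is a rough sketch of the "identical blocks" idea, not our actual inserter code. It assumes that a retried block can be recognised by a digest of its contents and that the replacing merge keeps the row with the highest version per key. `ToyTable`, `build_blocks`, `block_digest`, and the `key`/`version`/`value` row fields are all hypothetical names for illustration and are not ClickHouse APIs.

```python
import hashlib
import json


def build_blocks(rows, block_size):
    """Split rows into blocks in a way that depends only on the rows themselves."""
    # Sort on key, version, then the full serialized row, so arrival order
    # never influences which rows end up in which block.
    ordered = sorted(
        rows,
        key=lambda r: (r["key"], r["version"], json.dumps(r, sort_keys=True)),
    )
    return [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]


def block_digest(block):
    """Stable digest of a block; identical blocks always hash the same."""
    payload = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToyTable:
    """Hypothetical stand-in for a table that skips blocks it has already seen."""

    def __init__(self):
        self.seen = set()
        self.rows = []

    def insert(self, block):
        digest = block_digest(block)
        if digest in self.seen:
            return False  # same block as before (e.g. a retry), skipped
        self.seen.add(digest)
        self.rows.extend(block)
        return True

    def merged(self):
        """Keep only the highest version per key, like a replacing merge would."""
        latest = {}
        for row in self.rows:
            current = latest.get(row["key"])
            if current is None or row["version"] >= current["version"]:
                latest[row["key"]] = row
        return sorted(latest.values(), key=lambda r: r["key"])


if __name__ == "__main__":
    batch = [
        {"key": 2, "version": 1, "value": "b"},
        {"key": 1, "version": 1, "value": "a"},
        {"key": 1, "version": 2, "value": "c"},
    ]
    table = ToyTable()
    print([table.insert(b) for b in build_blocks(batch, 2)])
    # A retry that sees the same rows in a different order builds the same blocks.
    print([table.insert(b) for b in build_blocks(list(reversed(batch)), 2)])
    print(table.merged())
```

The point is that block boundaries must be a pure function of the input. If two inserters cut the same rows at different boundaries, the digests differ and the duplicates get through, which is where the replacing merge has to catch the stragglers (and only once the duplicate rows land in the same part).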

The CH team is great, and these will be fixed in time, but these were the problems we ran into with CH.

TSDB was really solid, but we never used it at a scale where it would tip over. Our use case is really aligned with Yandex's so a lot of the functionality they have built is useful to us in a way that TSDB's isn't. (Also, being able to page data to S3 is amazing).

2 comments

Thanks for your excellent contribution to this discussion. As the post author I wholly agree with your approach: if a solution hits the sweet spot for you in the context of your requirements that's the one you choose. Thank you for considering TimescaleDB alongside ClickHouse in what was obviously a well thought through assessment of these two excellent technologies.
Heh - somehow missed that I had already responded to this one, my apologies. (and no immediate way to edit after the fact).
(post author)

Thanks for the great, thoughtful feedback. We (Timescale) couldn't agree more that there is a lot to love about ClickHouse, especially where it truly excels.

Information like this is helpful for others currently in the "choose the right tool" part of the job and to the developers of the product. I can't imagine how different all of our offerings will look in a few more years! :-)