by mjarrett
576 days ago
What kinds of SQL queries could ClickHouse not handle? Were the limitations about expressivity of queries, performance, or something else? I'm considering using CH for storing observability (particularly tracing) data, so I'm curious about any footguns or other reasons it wouldn't be a good fit.
E.g., ClickHouse's support for intervals, which are an important type for observability, was lacking. You couldn't subtract datetimes to get an interval. If you compared a 2-millisecond interval to a one-second one, it wouldn't look at the unit and would say 2 ms is bigger, etc. So he had to go to the dev team, and after enough back and forth, instead of fixing it, they decided to return an error, and he had to insist for a long time until they actually implemented a proper solution.
Quoting him: "But like these endless issues with ClickHouse's flavor of SQL were problematic."
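To make the interval complaint concrete, here is a minimal sketch in Python (not ClickHouse SQL, and not a reproduction of ClickHouse's actual behavior). The `naive_compare` helper is hypothetical and only mimics the "ignore the unit" comparison described above; `timedelta` shows what unit-aware comparison and datetime subtraction look like.

```python
from datetime import datetime, timedelta


def naive_compare(a_value: int, b_value: int) -> bool:
    """Hypothetical helper: compares raw magnitudes and ignores units."""
    return a_value > b_value


# Unit-blind comparison: 2 (ms) vs 1 (s) looks "bigger", which is wrong.
naive_result = naive_compare(2, 1)

# Unit-aware comparison: timedelta normalizes both values first.
two_ms = timedelta(milliseconds=2)
one_s = timedelta(seconds=1)
unit_aware_result = two_ms > one_s

# Subtracting two datetimes yields an interval, e.g. a trace span duration.
start = datetime(2024, 1, 1, 12, 0, 0)
end = datetime(2024, 1, 1, 12, 0, 1, 500000)
span_duration = end - start

print(naive_result, unit_aware_result, span_duration.total_seconds())
```

For tracing data, this is the kind of operation you do constantly (span end minus span start, then compare against a threshold), which is why it mattered so much to him.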
Another problem seemed to be that benefiting from very large scale, with things like data in Parquet at rest plus a local cache, basically meant leaking all your money to AWS, because the self-hosted version didn't expose a way to do that yourself. ClickHouse scales fine at my size, so I can only trust him on that front since I'm nowhere near that big.
Funnily enough, after that they moved to Timescale, and the performance wasn't good enough for their use case.
They landed on DataFusion after a lot of trial and error.
But it's a really interesting perspective on the whole thing; you can see he is kinda obsessed with the user experience. The guy wrote a popular marshmallow alternative, two popular celery alternatives, and one popular watchdog alternative, all FOSS.
These kinds of people are the source of all imposter syndrome in the world.
I'll publish that video next week on Bite Code if I can. If I can't, it will have to wait 3 weeks because I'm leaving for a bit. But the one with Charlie Marsh (uv's author) is up, if you are into overachievers.