by shaklee3 2289 days ago
I think what you're getting at can be accomplished with materialized views in clickhouse now. Most queries that might be fast with inverted indices can be solved that way.

Also, as far as I can tell from the documentation, I don't think they use bloom filters for the index. There is certainly an option to use a bloom filter aggregator on a table for faster counts, but it's not the default. If you're referring to the fact that count() is not precise, there's an exact count function too. This is my speculation, though, and you may be right.

2 comments

Yeah, bloom filters aren't used by default, my bad. By default you get no indices at all! I was thinking about the tokenbf/ngrambf indices. These help a ton with sparser queries.
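
To make the idea concrete, here is a rough Python sketch of how a per-block token bloom filter can let a scan skip blocks that definitely don't contain a search token. It is only an illustration of the general technique, not ClickHouse's implementation; `TokenBloom`, `build_block_filters`, `search`, and the sizes used are all made up for the example.

```python
import hashlib


class TokenBloom:
    """Tiny Bloom filter over string tokens (sizes are illustrative only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, token):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{token}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, token):
        for pos in self._positions(token):
            self.bits |= 1 << pos

    def might_contain(self, token):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(token))


def build_block_filters(blocks):
    """Build one filter per block of rows (a stand-in for a storage granule)."""
    filters = []
    for rows in blocks:
        bloom = TokenBloom()
        for row in rows:
            for token in row.split():
                bloom.add(token)
        filters.append(bloom)
    return filters


def search(blocks, filters, token):
    """Return matching rows and how many blocks actually had to be scanned."""
    hits, scanned = [], 0
    for rows, bloom in zip(blocks, filters):
        if not bloom.might_contain(token):
            continue  # skip the whole block without reading its rows
        scanned += 1
        hits.extend(row for row in rows if token in row.split())
    return hits, scanned
```

The payoff grows with sparsity: if a token appears in only a few blocks, most blocks are skipped, while false positives cost an extra scan but never a wrong result.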

I will need to check out the materialised views. :)

> I think what you're getting at can be accomplished with materialized views in clickhouse now. Most queries that might be fast with inverted indices can be solved that way.

Let's say you have data with a few dozen dimensions, and want to compute aggregations filtered by any user-supplied union or intersection of dimension values. This is a fairly common use case in analytics dashboards. How do materialized views help with that?
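
For reference, the inverted-index approach to this kind of query can be sketched in a few lines of Python. This is a toy illustration under my own assumptions, not how any particular database does it; `build_inverted_index`, `filter_rows`, and the sample column names are hypothetical. The filter is expressed as an AND of groups, where each group is an OR over (dimension, value) pairs.

```python
from collections import defaultdict


def build_inverted_index(rows, dimensions):
    """Map each (dimension, value) pair to the set of row ids that have it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for dim in dimensions:
            index[(dim, row[dim])].add(row_id)
    return index


def filter_rows(index, groups):
    """Intersect the groups; within a group, union the (dimension, value) pairs."""
    if not groups:
        raise ValueError("at least one filter group is required")
    result = None
    for group in groups:
        ids = set()
        for key in group:
            ids |= index.get(key, set())  # union within the group
        result = ids if result is None else result & ids  # intersect groups
    return result


def sum_metric(rows, row_ids, metric):
    """Aggregate a metric over the selected rows only."""
    return sum(rows[i][metric] for i in row_ids)
```

For example, with rows carrying `country`, `device`, and `revenue`, the filter `[[("country", "US"), ("country", "DE")], [("device", "mobile")]]` selects mobile rows from either country, and the aggregation only touches those rows. The filter is arbitrary and supplied at query time, which is the part a fixed set of pre-aggregated views has trouble covering once there are a few dozen dimensions.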