by ignoramous 1219 days ago
Not OP but they might be referring to Uber moving from ES to ClickHouse to store their schema-flexible, structured logs, mostly to improve ingestion performance: https://archive.is/bFsTF / https://www.uber.com/blog/logging/

The gist of it is:

- Structured logs (JSON) are stored as kv pairs in parallel arrays, alongside metadata (host, timestamp, id, geo, namespace, etc).

- Log fields (i.e. kv pairs) are materialized (indexed) depending on query patterns, and vacuumed up if unused.

- Authoring queries and supporting Kibana dashboards are not trivial, but both are handled by a query translation layer.
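
To make the first bullet concrete, here is a rough Python sketch of flattening a JSON log line into two parallel arrays. It is only an illustration, not Uber's actual ingestion code: the function names, the dotted-key convention for nested fields, and the choice to stringify every value are all my own assumptions.

```python
import json


def flatten(obj, prefix=""):
    """Yield (dotted_key, value) pairs for every leaf of a nested dict."""
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, extending the dotted path.
            yield from flatten(value, path)
        else:
            yield path, value


def to_parallel_arrays(raw_json):
    """Split one JSON log line into two same-length arrays: keys and values.

    Values are stringified here for simplicity (an assumption of this sketch).
    """
    pairs = list(flatten(json.loads(raw_json)))
    keys = [k for k, _ in pairs]
    values = [str(v) for _, v in pairs]
    return keys, values


keys, values = to_parallel_arrays(
    '{"level": "error", "http": {"status": 502, "path": "/pay"}}'
)
print(keys)
print(values)
```

The point is that keys[i] always pairs with values[i], so a row needs no fixed schema: each log line just carries its own two arrays next to the fixed metadata columns.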


What do you mean by parallel arrays here?

Do you mean something like two arrays [k1, ..., kN] and [v1, ..., vN] in two different columns?

Is there a way in ClickHouse to filter such a pair of arrays such that you can do a search akin to vals[indexOfKey("foo")] == "bar"?

Yep. If you read the blog post I linked to it does talk a tonne about ClickHouse and what it can do (like indexOf, for example).
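
For illustration, here is a small Python sketch of that lookup. The column names `keys` and `vals` and the helper names are made up for the example. As far as I know, ClickHouse's indexOf is 1-based and returns 0 when the element is missing, and an out-of-range array access with a non-constant index gives the type's default value (an empty string for strings), so the helpers mimic that.

```python
# Roughly what a filter like this would express in ClickHouse
# (hypothetical column names):
#   WHERE vals[indexOf(keys, 'foo')] = 'bar'


def index_of(arr, x):
    """Return the 1-based position of the first x in arr, or 0 if absent."""
    for i, item in enumerate(arr, start=1):
        if item == x:
            return i
    return 0


def element_at(arr, i, default=""):
    """1-based element access that returns a default when i is out of range."""
    return arr[i - 1] if 1 <= i <= len(arr) else default


def field_equals(keys, vals, key, expected):
    """True if the value stored alongside `key` equals `expected`."""
    return element_at(vals, index_of(keys, key)) == expected


rows = [
    {"keys": ["foo", "host"], "vals": ["bar", "web-1"]},
    {"keys": ["foo", "host"], "vals": ["baz", "web-2"]},
    {"keys": ["host"], "vals": ["web-3"]},  # no "foo" key at all
]

matches = [r for r in rows if field_equals(r["keys"], r["vals"], "foo", "bar")]
print(matches)
```

One caveat with this pattern: because a missing key falls back to the default value, comparing against an empty string would also match rows that do not have the key at all.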
Ah, of course. Thanks!