
by jhgg 2586 days ago

We do rollups into BigQuery, where we have a bunch of dashboards for looking at historical data.
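For readers unfamiliar with the term, a rollup here just means pre-aggregating raw log events into coarser buckets before they reach the dashboard store. A minimal sketch in plain Python, assuming events arrive as `(timestamp, level)` pairs and that per-minute counts are the desired granularity (both are illustrative assumptions, not details from the comment):

```python
from collections import Counter
from datetime import datetime

def rollup_per_minute(events):
    """Count events per (minute, level) bucket.

    `events` is an iterable of (datetime, level) pairs; the pair shape and
    the one-minute bucket size are assumptions made for this sketch.
    """
    counts = Counter()
    for ts, level in events:
        # Truncate the timestamp to the start of its minute.
        minute = ts.replace(second=0, microsecond=0)
        counts[(minute, level)] += 1
    return counts

events = [
    (datetime(2019, 3, 1, 12, 0, 5), "ERROR"),
    (datetime(2019, 3, 1, 12, 0, 40), "ERROR"),
    (datetime(2019, 3, 1, 12, 1, 2), "INFO"),
]
rollup = rollup_per_minute(events)
```

Only the much smaller `rollup` table would then be shipped to the dashboard backend rather than the raw lines.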

I did really like Kibana, but ultimately we had to ditch it (because we ditched ES). Of course, this was a good thing, as I more than once degraded ingest on the ES cluster just by using Kibana to do some aggressive filtering. ClickHouse handles these queries without problem.

I think a more complete world view may be to pipe logs into Kafka and ingest them into ClickHouse/Druid for different types of analysis and rollups.

Our current logging volume now exceeds ~10b log lines per day. ClickHouse handles this ingest almost too well (we have three 16-core nodes that sit at 5% CPU). This is down from a 20ish-node ES cluster that basically sat pegged on CPU, and our log volume then was ~1b/day.
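To put those numbers in per-second terms, here is a back-of-the-envelope calculation. It assumes load is spread evenly across nodes and over the day (so these are averages, not peaks), and it takes "20ish" as exactly 20:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def lines_per_second(lines_per_day: float, nodes: int = 1) -> float:
    """Average ingest rate per node, assuming an even spread across nodes."""
    return lines_per_day / SECONDS_PER_DAY / nodes

clickhouse_total = lines_per_second(10e9)        # whole cluster, ~10b/day
clickhouse_per_node = lines_per_second(10e9, 3)  # three nodes
es_per_node = lines_per_second(1e9, 20)          # old cluster, ~1b/day, 20 nodes assumed

print(f"ClickHouse: {clickhouse_total:,.0f} lines/s total, "
      f"{clickhouse_per_node:,.0f} per node")
print(f"ES (old):   {es_per_node:,.0f} lines/s per node")
```

That works out to roughly 116k lines/s across the ClickHouse cluster (about 39k per node), against roughly 580 lines/s per node on the old ES cluster.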

For more ad-hoc analysis, we just use the clickhouse-cli to query the dataset directly. We are tangentially investigating using Superset with it.

1 comment

reacharavindh 2585 days ago

Thanks for the response. Lots of tips for me to go research.

I was mentally debating between trying to find a schema for our logs and storing them in a database where they can be queried efficiently

Vs

Throwing logs into Elasticsearch in a lazy way and letting it index the whole thing so we can do full-text search on logs, but with the limitation of only having a few days' worth of data in the ES indexes.
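As a rough illustration of the first option, "finding a schema" mostly means pulling a few fixed columns out of each line and keeping the rest as free text. A minimal sketch, assuming a hypothetical line format of `<ISO timestamp> <LEVEL> <service> <message>` (the format, the `LogRow` type, and the field names are all made up for this example):

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical line format: "<ISO timestamp> <LEVEL> <service> <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<service>\S+) (?P<message>.*)$"
)

@dataclass
class LogRow:
    ts: datetime
    level: str
    service: str
    message: str  # unstructured remainder, kept as plain text

def parse_line(line: str) -> Optional[LogRow]:
    """Return a structured row, or None if the line does not fit the schema."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    try:
        ts = datetime.fromisoformat(m["ts"])
    except ValueError:
        return None  # timestamp field was not a valid ISO timestamp
    return LogRow(ts, m["level"], m["service"], m["message"])

row = parse_line("2019-03-01T12:00:05 ERROR api timeout talking to db")
```

Rows like this map directly onto typed columns in a columnar store, while lines that fail to parse can be diverted to a fallback path instead of being dropped.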

Kibana’s visualisation is what is keeping ES in the running for me. I will look into Superset + ClickHouse to see if I can come up with a good analysis front end for our log data.