HN Mirror


	by jhgg 2585 days ago
	We use a rather bespoke syslog -> clickhouse log sink (https://github.com/discordapp/punt/tree/clickhouse) that we wrote in-house because logstash (and then subsequently Elasticsearch) was too slow. Would love to switch off of it and to this! Hopefully a clickhouse sink comes soon! Maybe we will contribute one upstream!

2 comments

reacharavindh 2585 days ago

Out of curiosity, could you tell us a little more about your log analysis workflow? Once they are in Clickhouse, how do you visualise/search/analyse your logs? What is your equivalent of Kibana?

jhgg 2585 days ago

We do rollups into BigQuery, where we have a bunch of dashboards to look at stuff historically.

I did really like Kibana, but ultimately we had to ditch it (because we ditched ES). That was a good thing, though, as I more than once degraded ingest on the ES cluster just by using Kibana to do some aggressive filtering. Clickhouse handles these queries without a problem.

I think a more complete world view may be to pipe logs into Kafka, and ingest them into Clickhouse/Druid for different types of analysis/rollups.

Our current logging volume exceeds ~10b log lines per day now. Clickhouse handles this ingest almost too well (we have three 16-core nodes that sit at 5% CPU). This is down from a... 20ish-node ES cluster that basically sat pegged on CPU... and our log volume then was ~1b/day.

For more ad-hoc analysis, we just use the clickhouse-cli to query the dataset directly. We are tangentially investigating using Superset with it.
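
To put the figures above in per-second terms, here is a rough back-of-envelope sketch. It assumes "b" means billion and that load is spread evenly over the day (real traffic is burstier); the function and variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def lines_per_second(lines_per_day: float) -> float:
    """Average ingest rate, assuming traffic is spread evenly over the day."""
    return lines_per_day / SECONDS_PER_DAY


# Assumption: "~10b" and "~1b" mean 10 billion and 1 billion lines per day.
clickhouse_total = lines_per_second(10e9)
clickhouse_per_node = clickhouse_total / 3   # three 16-core nodes

es_total = lines_per_second(1e9)
es_per_node = es_total / 20                  # "20ish" nodes, taken as exactly 20

print(f"Clickhouse: {clickhouse_total:,.0f} lines/s total, "
      f"{clickhouse_per_node:,.0f} per node")
print(f"ES:         {es_total:,.0f} lines/s total, "
      f"{es_per_node:,.0f} per node")
```

On those assumptions, each of the three Clickhouse nodes takes in far more lines per second than each of the ~20 ES nodes did.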

reacharavindh 2585 days ago

Thanks for the response. Lots of tips for me to go research.

I was mentally debating between two options. One is trying to find a schema for our logs and storing them in a database where they can be queried efficiently.

The other is throwing logs into Elasticsearch in a lazy way and letting it index the whole thing so we can do full-text search on logs, but with the limitation of only having a few days' worth of data in ES indexes.

Kibana’s visualisation is what is keeping ES in the running for me. I will look into Superset+Clickhouse to see if I can come up with a good analysis front end for our log data.

binarylogic 2585 days ago

Absolutely, this is likely the next integration we'll be working on. There were a few schema-related features that we needed to support before we started, but we're _very_ close. We'd love beta testers to help us build it out. Feel free to email us if you're interested: vector@timber.io

codepodu 2585 days ago

Heyyy you're the one who wrote authlogic! What a blast from the past :) Thank you for your contribution to the Rails community, and congrats on shipping Timber & Vector!
