by tecoholic 40 days ago
I was looking into this just yesterday. The Loki + … comparison is a bit off for the open source space. The main options here are SigNoz and ClickStack, both of which use ClickHouse as the database. They are heavy compared to something like Loki, but they are OTEL-native observability platforms rather than log monitoring tools, so they are not in the same category.

I used SigNoz + ClickStack on a vibe-coded Go server project a few weeks ago. I just had Codex figure out how to set up SigNoz and its dependencies via Docker Compose. I even got it to pre-populate SigNoz with dashboards. It wasn't too bad. The whole thing runs in a few GB. I tried to cover metrics, tracing, and logging at the same time. This is not a production-ready setup, but you need to trade off cost vs. utility here. If it's useful enough, that could justify the extra cost.

I have done a lot of work on the Elastic stack related to this, including setting up a big Elastic Fleet based stack for one client at some point. It might not be the cheapest, but it does provide awesome filtering and querying capabilities. However, a lot of teams that use it don't really know how to tap into that capability, so it tends to be overengineered for what it does in the end. That extra, underutilized complexity is why a lot of teams are wary of dealing with that stack.

Storing the data is the easy part, but what's the point if you can't run queries against it and produce dashboards and diagnostic tools that actually help you? Prometheus/Grafana or older Graphite-type setups tend to be compromises where you get lots of data but are then limited on the querying front or in the number of metrics. The tradeoff is always between scale and querying flexibility. If you store tens/hundreds of GB of telemetry per day, you need a way to make sense of it. ClickHouse seems to be quite good at scaling and querying. It's basically a column-oriented database. I don't have direct experience with Loki.
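To make the "column database" remark concrete, here is a toy sketch of the same aggregate computed against a row layout and a column layout. It is not how ClickHouse is implemented; the field names and the sample data are made up for illustration. It only shows why an aggregate over one or two fields touches less data when each field is stored separately.

```python
from statistics import mean

# Hypothetical telemetry records; field names are illustrative only.
rows = [
    {"service": "api", "status": 200, "duration_ms": 12.0},
    {"service": "api", "status": 500, "duration_ms": 250.0},
    {"service": "worker", "status": 200, "duration_ms": 40.0},
    {"service": "api", "status": 200, "duration_ms": 18.0},
]


def to_columns(records):
    """Pivot row-oriented records into one list per field."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


def mean_duration_row_layout(records, service):
    # Row layout: every full record is visited, including unused fields.
    return mean(r["duration_ms"] for r in records if r["service"] == service)


def mean_duration_column_layout(columns, service):
    # Column layout: only the two columns the query needs are read.
    pairs = zip(columns["service"], columns["duration_ms"])
    return mean(duration for name, duration in pairs if name == service)


columns = to_columns(rows)
```

Both functions return the same answer; the difference is that the column version never looks at `status`, which is the property that matters once there are dozens of fields and billions of rows.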

But in the end, all that power only matters if people actually use it. And, again, in my experience teams tend not to. They tend to have a lot of unrealized aspirations around their tools and infrastructure. If it's just a dumping ground for data + a few simplistic dashboards, optimize for that. A lot of that data is actually only kept for compliance/auditing reasons. For that, querying is usually a secondary concern and it's OK if queries take a bit longer and are less powerful.

I agree. The sentiment applies to most analytics. The people who set up analytics are not the same as the end users.
You're absolutely on point with this. I've made the perf tracking opinionated, so it comes preconfigured with SLOs that are good for most projects where nobody would bother to set them up.
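As a rough idea of what a preconfigured SLO can look like, here is a minimal sketch of a default latency check. This is not Traceway's actual implementation: the `LatencySLO` type, the p95 choice, and the 500 ms threshold are all assumptions picked for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencySLO:
    percentile: float    # e.g. 95.0 for p95
    threshold_ms: float  # the percentile must stay at or below this


# Hypothetical default that ships when nobody configures anything.
DEFAULT_SLO = LatencySLO(percentile=95.0, threshold_ms=500.0)


def nearest_rank_percentile(samples, percentile):
    """Return the nearest-rank percentile of a non-empty list of samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def meets_slo(samples, slo=DEFAULT_SLO):
    """True if the configured percentile is within the threshold."""
    return nearest_rank_percentile(samples, slo.percentile) <= slo.threshold_ms
```

With defaults like this an endpoint gets a pass/fail signal from day one, and a team that cares can still pass its own `LatencySLO` later.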

Traceway has custom dashboards, fully supports OTEL logs/traces/metrics/exceptions, has session replays for web and Flutter (working on iOS/Android now), has alerting integrations with Slack/email/GitHub, OAuth login with Google/GitHub, and a bunch of other features... All MIT. None behind a paywall.

It has a specific set of trade-offs. Those are by design, but I am always open to changing them and improving it. If you try it and have any thoughts, the git issues are constantly monitored.

Agreed, it's a trade-off I am ok with for now.

In reality it's a very modular system: the telemetry repositories can be swapped out easily. I have implemented a ClickHouse and a SQLite version (to simplify self-hosting), so adding a Loki-like repository would be a breeze. It's not on the roadmap currently, as I am putting a lot of effort into 3 different parts right now.
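For readers wondering what "swappable telemetry repositories" usually means in practice, here is a minimal sketch of the pattern. It is not Traceway's code: `LogRepository`, both implementations, the table schema, and `record_startup` are hypothetical names invented for this example. The point is only that callers depend on a small interface, so a new backend is one more class.

```python
import sqlite3
from typing import Protocol


class LogRepository(Protocol):
    """Hypothetical storage interface that every backend implements."""

    def append(self, service: str, message: str) -> None: ...

    def search(self, service: str) -> list[str]: ...


class InMemoryLogRepository:
    def __init__(self) -> None:
        self._entries: list[tuple[str, str]] = []

    def append(self, service: str, message: str) -> None:
        self._entries.append((service, message))

    def search(self, service: str) -> list[str]:
        return [msg for svc, msg in self._entries if svc == service]


class SQLiteLogRepository:
    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS logs ("
            "id INTEGER PRIMARY KEY, service TEXT NOT NULL, message TEXT NOT NULL)"
        )

    def append(self, service: str, message: str) -> None:
        self._conn.execute(
            "INSERT INTO logs (service, message) VALUES (?, ?)", (service, message)
        )
        self._conn.commit()

    def search(self, service: str) -> list[str]:
        cursor = self._conn.execute(
            "SELECT message FROM logs WHERE service = ? ORDER BY id", (service,)
        )
        return [row[0] for row in cursor]


def record_startup(repo: LogRepository, service: str) -> list[str]:
    """Application code only sees the interface, never the backend."""
    repo.append(service, "started")
    return repo.search(service)
```

Calling `record_startup(SQLiteLogRepository(), "api")` and `record_startup(InMemoryLogRepository(), "api")` behaves the same from the caller's side, which is what would make a Loki-like backend a drop-in addition.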

The truth is that ClickHouse is an incredible DB that scales really well for observability data.

I'm partial to OpenObserve, especially because in Ruby the OTEL stuff isn't great for metrics and logs yet.
I also run OpenObserve at home, but I can't help but feel that the interface could use some... sparkle, and the mobile experience kinda sucks.

But you can't beat the excellent price and performance. It does what I need and much more.

When I was starting Traceway I was heavily inspired by skylightio from the Ruby ecosystem. I loved their SLOs/ranking perf issues, but I also wanted the features that Sentry offered in one place.