by jhinds 3188 days ago
Neat! I was looking into tracing solutions for our k8s cluster the other day and was going to look into setting up Zipkin. Now I'll add this to my list of tools to evaluate. I found this blog post by Uber informative: https://eng.uber.com/distributed-tracing/ so maybe there is no need to even set up Zipkin, and I can just start with Jaeger?
5 comments

Most folks will choose either Zipkin or Jaeger, but both are OpenTracing-compatible distributed tracing systems. You might find the Cloud Native Landscape useful for thinking about the options: https://github.com/cncf/landscape/blob/master/README.md

Disclosure: I’m the executive director of CNCF, which just adopted Jaeger 2 weeks ago, and I’m an author of the landscape.

Last I checked "OpenTracing-compatible" only went as far as using common terminology. Tbh I was a bit disappointed by this; has more been defined since? E.g. are there now shared schemas, APIs of sorts?
Yes, OpenTracing is an API, with bindings currently available in 9 languages. Please take a look.

http://opentracing.io/

Only the semantics and libraries implementing them are there. The wire format is not specified, which is a pretty annoying problem to deal with.
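To make the wire-format point concrete, here is a minimal sketch of the same span context serialized two ways: as Zipkin-style B3 headers and as a Jaeger-style `uber-trace-id` header. The `SpanContext` type and both function names are hypothetical, made up for illustration; only the header names and the general `trace-id:span-id:parent-id:flags` layout are taken from the two projects' propagation formats, and real clients may differ in padding and flag handling.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SpanContext:
    """Hypothetical span context, not a type from any real tracing client."""
    trace_id: int
    span_id: int
    parent_id: int  # 0 means "no parent" (root span)
    sampled: bool


def to_b3_headers(ctx: SpanContext) -> Dict[str, str]:
    """Zipkin-style B3 encoding: one HTTP header per field."""
    headers = {
        "X-B3-TraceId": format(ctx.trace_id, "016x"),
        "X-B3-SpanId": format(ctx.span_id, "016x"),
        "X-B3-Sampled": "1" if ctx.sampled else "0",
    }
    if ctx.parent_id:
        headers["X-B3-ParentSpanId"] = format(ctx.parent_id, "016x")
    return headers


def to_jaeger_headers(ctx: SpanContext) -> Dict[str, str]:
    """Jaeger-style encoding: a single colon-separated header value."""
    flags = 1 if ctx.sampled else 0  # lowest bit marks the trace as sampled
    value = "{:x}:{:x}:{:x}:{:x}".format(
        ctx.trace_id, ctx.span_id, ctx.parent_id, flags
    )
    return {"uber-trace-id": value}


ctx = SpanContext(
    trace_id=0x463AC35C9F6413AD,
    span_id=0xA2FB4A1D1A96D312,
    parent_id=0,
    sampled=True,
)
print(to_b3_headers(ctx))
print(to_jaeger_headers(ctx))
```

Both encodings carry the same information, but a service that only reads one set of headers will not continue a trace started with the other. That mismatch is what an API-only standard leaves open.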
It's weird for most people. We're used to cross-language wire protocols. OpenTracing is different.

An analogy is SLF4J for Java logging. All libraries etc. use the same interface, and the final user determines the backend: java.util.logging, Logback. This makes sense if you have many authors of libraries with a cross-cutting concern.
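A rough sketch of that facade pattern, in Python rather than Java for brevity. Every name here (`Tracer`, `NoopTracer`, `RecordingTracer`, `set_global_tracer`, `fetch_user`) is hypothetical and is not the OpenTracing or SLF4J API; the point is only that library code calls an interface while the application picks the backend.

```python
from typing import List, Protocol


class Tracer(Protocol):
    """The shared interface that library authors code against."""

    def start_span(self, operation: str) -> None: ...


class NoopTracer:
    """Default backend: instrumentation costs nothing until one is chosen."""

    def start_span(self, operation: str) -> None:
        pass


class RecordingTracer:
    """Stand-in for a real backend such as a Zipkin or Jaeger client."""

    def __init__(self) -> None:
        self.operations: List[str] = []

    def start_span(self, operation: str) -> None:
        self.operations.append(operation)


_tracer: Tracer = NoopTracer()


def set_global_tracer(tracer: Tracer) -> None:
    """Called once by the application, never by libraries."""
    global _tracer
    _tracer = tracer


def fetch_user(user_id: int) -> dict:
    """Library code: knows the facade, not the tracing system behind it."""
    _tracer.start_span("fetch_user")
    return {"id": user_id}


backend = RecordingTracer()
set_global_tracer(backend)
fetch_user(7)
print(backend.operations)
```

Swapping `RecordingTracer` for another backend changes nothing in `fetch_user`, which is the property the analogy is pointing at.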

This really makes OpenTracing half a dozen different standards, one per language, with common semantics.

Should it be about a wire protocol instead? Discussion at https://github.com/opentracing/specification/issues/34

There are discussions about it happening. If you haven't yet, join the community and make your opinion heard :)
"OpenTracing-compatible" is strict API compatibility in any supported language. The cross-language spec is "terminology-based" since it's, well, cross-language.

There is an open issue about Envoy/linkerd/Istio support here: https://github.com/opentracing/specification/issues/86 (as well as in a number of other locations)

As an OpenTracing contributor, the core value prop still seems quite strong in that instrumentation of OSS dependencies is a massive pain point and should not be tracing-system-specific since it doesn't need to be. There is also value in common protocols and formats, and in that spirit there is interest in broadening scope to include those... though from seeing many companies adopt tracing tech, I haven't observed protocol compatibility as the main pain point or blocker.

I have used that landscape document as a very informative point of reference for the last couple of months. Thank you for creating it.
You're very welcome. You cannot imagine the amount of criticism it has engendered.
How do you plan on keeping it sane over time? Even disregarding paring down the number of times managed major providers appear on the list (which is super sensible imo), won't it start to get quite crowded with all the open source competitors appearing?
The guidelines we're following are at: https://github.com/cncf/landscape#cloud-native-landscape-pro...

The CNCF storage WG is also looking at creating a "zoomed in" version of the storage section with higher fidelity information. That's one model of providing more detail.

We also have an interactive version of the landscape coming that will provide filtering, zooming, etc.

If you want to trace applications deployed on Kubernetes, you might benefit from Jaeger's Kubernetes templates: https://github.com/jaegertracing/jaeger-kubernetes
Last week I did a test of Linkerd with Zipkin for k8s clusters. Works like a charm, but it is still a bit more work to use Jaeger as a Zipkin replacement, as they do not support the same protocols yet. I believe https://github.com/linkerd/linkerd-zipkin will fix that, but I haven't tested it yet.
I see Zipkin is a Java app. I don’t want to sound like I’m hating on a language for no reason, but I wonder if it’s awfully heavy and slow to launch like most other Java apps? By comparison, I’d expect a tracer written in Go to be significantly more efficient.
Does slow startup time matter for services that are supposed to be running continuously in a cluster environment?

I have no doubt that a Go tracer would start orders of magnitude faster than a Java one (especially if it pulls in Spring or other web-related dependencies for the Zipkin UI), but I think it is irrelevant.

It's irrelevant. If you take a look at the diagrams in the blog post that include Zipkin, it is basically only the query/frontend portion. The tracing itself is done natively in the language the code is written in, and written out to Cassandra. The Zipkin part is a long-running server that just needs to query Cassandra.
I am evaluating Google Stackdriver as well, with a Zipkin collector. Will see...
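The split described here can be sketched roughly as follows. This is a toy model under stated assumptions: `SpanStore` is an in-memory stand-in for Cassandra, and `Span`, `report_span` and `query_trace` are hypothetical names, not Zipkin or Jaeger APIs.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Span(NamedTuple):
    trace_id: str
    name: str
    duration_ms: float


class SpanStore:
    """In-memory stand-in for the shared storage (Cassandra in the post)."""

    def __init__(self) -> None:
        self._by_trace: Dict[str, List[Span]] = defaultdict(list)

    def write(self, span: Span) -> None:
        self._by_trace[span.trace_id].append(span)

    def read(self, trace_id: str) -> List[Span]:
        return list(self._by_trace.get(trace_id, []))


def report_span(store: SpanStore, trace_id: str, name: str,
                duration_ms: float) -> None:
    """Application side: runs in-process, in whatever language the app uses."""
    store.write(Span(trace_id, name, duration_ms))


def query_trace(store: SpanStore, trace_id: str) -> List[Span]:
    """Query side: a long-running server that only reads from storage."""
    return sorted(store.read(trace_id),
                  key=lambda s: s.duration_ms, reverse=True)


store = SpanStore()
report_span(store, "t1", "db.query", 12.0)
report_span(store, "t1", "http.request", 40.0)
for span in query_trace(store, "t1"):
    print(span.name, span.duration_ms)
```

The hot path (`report_span`) never touches the query server, so the query server's startup time and runtime have no bearing on request latency in the traced services.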