by tnolet 732 days ago
A recent example of OTel confusion.

I could not, for the life of me, get the Python integration to send traces to a collector. Same URL, same setup, same API key as for Node.js and Go.

Turns out the Python SDK expects a URL-encoded header, e.g. “Bearer%20somekey”, whereas all other SDKs just accept a string with a whitespace.
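For anyone hitting the same thing, here is a minimal sketch of producing the URL-encoded form with the standard library. It assumes the header is passed through the `OTEL_EXPORTER_OTLP_HEADERS` environment variable and that the header name is `Authorization`; both are assumptions, since the post only gives the value. `otlp_headers_value` is a hypothetical helper, not part of any SDK.

```python
from urllib.parse import quote


def otlp_headers_value(headers):
    """Join key=value pairs with commas, URL-encoding each value.

    Hypothetical helper: the header name and the use of
    OTEL_EXPORTER_OTLP_HEADERS are assumptions for illustration.
    """
    return ",".join(
        f"{key}={quote(value, safe='')}" for key, value in headers.items()
    )


# "somekey" is the placeholder key from the post, not a real credential.
encoded = otlp_headers_value({"Authorization": "Bearer somekey"})
print(encoded)  # Authorization=Bearer%20somekey
```

The resulting string is what you would export as the environment variable before starting the Python process.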

The whole split between HTTP, protobuf over HTTP and gRPC is also massively confusing.
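To make the split a bit more concrete, here is a rough sketch of how the three values of `OTEL_EXPORTER_OTLP_PROTOCOL` map to where traces end up when you only configure a base endpoint. The defaults (port 4317 for gRPC, port 4318 for HTTP, `/v1/traces` appended for the HTTP variants) are taken from the OTLP exporter spec as I understand it; `traces_url` is a hypothetical helper, not an SDK function, and individual SDKs may deviate.

```python
# Default base endpoints per protocol, as I read the OTLP exporter spec.
DEFAULT_ENDPOINTS = {
    "grpc": "http://localhost:4317",
    "http/protobuf": "http://localhost:4318",
    "http/json": "http://localhost:4318",
}


def traces_url(protocol, base=None):
    """Return the URL traces would be sent to for a given protocol.

    Hypothetical helper for illustration only.
    """
    if protocol not in DEFAULT_ENDPOINTS:
        raise ValueError(f"unknown OTLP protocol: {protocol!r}")
    base = base or DEFAULT_ENDPOINTS[protocol]
    if protocol == "grpc":
        # gRPC uses the endpoint as-is; the service path is part of the RPC.
        return base
    # The HTTP variants append a per-signal path to the base endpoint.
    return base.rstrip("/") + "/v1/traces"


for proto in DEFAULT_ENDPOINTS:
    print(proto, "->", traces_url(proto))
```

So the same "collector URL" can legitimately mean two different ports and paths depending on which protocol the SDK defaults to.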


The silent failure policy of OTEL makes flames shoot out of the top of my head.

We had to use Wireshark to identify a super nasty bug in the “JavaScript” (but actually TypeScript, despite being called opentelemetry-js) implementation.

And OTEL is largely unsuitable for short lived processes like CLIs, CI/CD. And I would wager the same holds for FaaS (Lambda).

In the end I prefer the network topology of StatsD, which is what we were migrating from. Let the collector do ALL of the bookkeeping instead of faffing about. OTEL is actively hostile to process-per-thread programming languages. If I had it to do over again I’d look at the StatsD->Prometheus integrations, and the StatsD extensions that support tagging.

> And OTEL is largely unsuitable for short lived processes like CLIs, CI/CD. And I would wager the same holds for FaaS (Lambda).

Not necessarily true. For example, in one of my hobby Golang projects I found out that you can cleanly shut down the OTel collector so that it flushes its backlog of traces / metrics / logs. That way I was able to get telemetry even for CLI tool invocations that lasted 5-10 secs (connect to servers, get data, operate on it, put it someplace else, quit).

But now that you mention it, it would be nasty if that's not the default behavior indeed.
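To illustrate why the explicit shutdown matters for short-lived processes, here is a toy model in plain Python. It is not the OTel SDK: `BatchingExporter`, `run_cli` and the list used as a "backend" are all made up for the sketch. The only assumption is that the exporter buffers items and sends them in batches, which is the behavior being discussed.

```python
class BatchingExporter:
    """Toy batching exporter: buffers items, sends them in batches."""

    def __init__(self, sink, batch_size=512):
        self.sink = sink              # stand-in for the remote backend
        self.batch_size = batch_size
        self.buffer = []

    def export(self, item):
        self.buffer.append(item)
        # Only send once a full batch has accumulated.
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink.extend(self.buffer)
            self.buffer.clear()

    def shutdown(self):
        # Drain whatever is still buffered before the process exits.
        self.flush()


def run_cli(exporter):
    """A short-lived 'CLI' that emits far fewer items than one batch."""
    for step in ("connect", "fetch", "store"):
        exporter.export(step)


received = []
exporter = BatchingExporter(received)
run_cli(exporter)
print(len(received))  # 0: nothing has left the process yet
exporter.shutdown()
print(len(received))  # 3: the backlog was flushed on shutdown
```

If the process exits between the two prints, the three items are simply gone, which is the silent data loss a CLI or CI job would see without a flush on exit.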

> OTEL is actively hostile to process-per-thread programming languages

Can you explain why, please?

Yeah. And Otel actually has pretty nice debugging. You just need to set the right environment variable. But in prod it will blow up your logs.

Sounds like a problem with the Python sdk.

Well actually. They (python SDK maintainers) argue their implementation is the correct one according to the spec. See this issue thread for example.

https://github.com/open-telemetry/opentelemetry-specificatio...

There are more. This is a symptom of how hard it is to dive into Otel due to its surface area being so big.

> Well actually. They (python SDK maintainers) argue their implementation is the correct one according to the spec. See this issue thread for example.

The comment section of that issue gives off contrarian vibes. Apparently the problem is that the Python SDK maintainers refuse to support a use case that virtually all other SDKs support. There are some weasel words that try to convey the idea that half the SDKs are with Python, while in reality the ones that support the choices followed by the Python SDK actually support all scenarios.

From the looks of it, the Python SDK maintainers are purposely making a mountain out of a molehill that could be levelled with a single commit with a single line of code.

I guess you word it better than I did.

As a user it feels very weird to wade into threads like this to find a solution to your problem.

The power of Otel is that it is an open standard. But practice shows that implementing that standard / spec leads to all kinds of issues and fiefdoms.