...huh? I work with customers who (through a mistake) have created literally multi-million span traces using OTel. Are you referring to a particular backend?
Well that's a shame, I'm going to ask some folks about that. 500 spans per trace is ridiculously small and I can't imagine any good reason to have that limitation since it's just not that big of a footprint.
OTel doesn't define any limits on the # of spans in a trace (nor the # of attributes on a span!) but it will be bound by the limits of whatever backend you use. In the case of the one I work for, we do limit the total size of a span to be 1MB or less with 64KB per attribute before truncation. Other backends have different limitations. This is the first I've heard of such a small limitation on the total number of spans in a trace though. Traces are just (basically) collections of structured logs with in-built correlation IDs. I can't imagine why you'd limit them like this.
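To make the "structured logs with correlation IDs" point concrete, here is a rough sketch of a span as a plain record plus the kind of size check a backend might apply. Everything in it is an assumption for illustration: the field names, the `accept` helper, and the limits (which just mirror the 64KB-per-attribute and 1MB-per-span numbers above). It is not any particular backend's actual ingest logic.

```python
import json
import secrets

# Hypothetical limits, mirroring the numbers mentioned above.
MAX_ATTR_BYTES = 64 * 1024      # per-attribute cap; longer values get truncated
MAX_SPAN_BYTES = 1024 * 1024    # total span cap; larger spans get rejected


def make_span(trace_id, name, parent_id=None, **attributes):
    """A span is basically a structured log record carrying correlation IDs."""
    return {
        "trace_id": trace_id,             # shared by every span in the trace
        "span_id": secrets.token_hex(8),  # unique per span
        "parent_id": parent_id,           # links the span into the tree
        "name": name,
        "attributes": attributes,
    }


def truncate_attributes(span):
    """Return a copy with oversized string attributes cut to MAX_ATTR_BYTES."""
    out = dict(span)
    out["attributes"] = {}
    for key, value in span["attributes"].items():
        if isinstance(value, str):
            raw = value.encode("utf-8")
            if len(raw) > MAX_ATTR_BYTES:
                # errors="ignore" drops a multi-byte character split by the cut
                value = raw[:MAX_ATTR_BYTES].decode("utf-8", errors="ignore")
        out["attributes"][key] = value
    return out


def accept(span):
    """Truncate attributes, then reject the span loudly if it is still too big."""
    span = truncate_attributes(span)
    size = len(json.dumps(span).encode("utf-8"))  # approximate serialized size
    if size > MAX_SPAN_BYTES:
        raise ValueError(f"span {span['span_id']} is {size} bytes, over the limit")
    return span


trace_id = secrets.token_hex(16)
root = make_span(trace_id, "GET /checkout")
child = make_span(trace_id, "db query", parent_id=root["span_id"],
                  statement="x" * 100_000)
stored = accept(child)
print(len(stored["attributes"]["statement"]))
```

Note there is nothing here that counts spans per trace: a trace is just however many records happen to share a `trace_id`, which is why a hard cap on span count would have to come from the backend.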
That was two years ago (we tried spans before metrics), so it’s fuzzy. I believe the collector sidecar was fine with it but the backend was not, which complicated debugging. There’s not a clear feedback path in OpenTelemetry that we could find. I completely forgot to mention the tendency toward silent failures. That’s a cardinal sin for telemetry. I would take it out back and shoot it for that fact alone.
The other problem I noticed looking at the wire protocol was that the data for the parent trace doesn’t seem to get sent until the trace closes. That seems like a bookkeeping nightmare to me. There should be a start-of-trace packet and an update at the end. I shouldn’t have finished spans showing up before the parent trace has been registered. And that’s what it looked like in the dumps my ops people sent me to debug.
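For anyone trying to picture the ordering described above: OTel SDKs hand a span to the exporter when the span ends, and a parent normally ends after its children, so children go out first. Here is a toy model of that using only the standard library. The `span` context manager, the `exported` list, and `orphans_on_arrival` are made up for this sketch and are not the OpenTelemetry API.

```python
from contextlib import contextmanager
import itertools

exported = []              # stands in for the stream the backend receives
_ids = itertools.count(1)  # simple sequential span IDs for readability


@contextmanager
def span(name, parent=None):
    """Toy span: it is only 'sent' (appended to exported) when it ends."""
    record = {
        "span_id": next(_ids),
        "parent_id": parent["span_id"] if parent else None,
        "name": name,
    }
    try:
        yield record
    finally:
        exported.append(record)  # export-on-end, never on start


def orphans_on_arrival(stream):
    """Names of spans that arrived before the backend had seen their parent."""
    seen, orphans = set(), []
    for rec in stream:
        if rec["parent_id"] is not None and rec["parent_id"] not in seen:
            orphans.append(rec["name"])
        seen.add(rec["span_id"])
    return orphans


with span("request") as root:
    with span("auth", parent=root):
        pass
    with span("db query", parent=root):
        pass

print([r["name"] for r in exported])  # ['auth', 'db query', 'request']
print(orphans_on_arrival(exported))   # ['auth', 'db query']
```

In this model every child shows up referencing a parent the receiver has not seen yet, which matches what the dumps looked like. The receiver has to buffer or accept dangling parent IDs until the root finally arrives.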
Practically a given outcome, then; we could knock their Managed Prometheus offering off the Internet on the regular. It was just laughable for a company that prides itself on one trillion IAM transactions to 429 some metric ingest.