Firstly, not all spans are interesting. When 99.99% of your traffic is just going to serve up an HTTP 200 within your acceptable latency threshold, you don't need every one of those. You probably do want to keep 100% of error spans, and those where the root span's duration is beyond a configured threshold. There are tools that can sample that way.
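To make that concrete, here's a rough sketch of the kind of keep/drop decision I mean. This isn't any particular tool's API: `Trace`, `keep_trace`, the 500 ms threshold and the 1% baseline rate are all made-up names and numbers for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Trace:
    """Hypothetical summary of a finished trace (not a real OTel type)."""
    has_error: bool
    root_duration_ms: float


def keep_trace(trace, slow_threshold_ms=500.0, baseline_rate=0.01, rng=random.random):
    """Decide whether to keep a trace once it has completed.

    Assumed policy: keep every error, keep every slow root span,
    and keep only a small random fraction of everything else.
    """
    if trace.has_error:
        return True
    if trace.root_duration_ms > slow_threshold_ms:
        return True
    # Boring, fast, successful traffic: keep roughly baseline_rate of it.
    return rng() < baseline_rate
```

Note the decision needs the whole trace (error status and root duration), so it has to happen after the trace finishes, not when the request starts.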
Secondly, there are also ways to attach your effective sample rate as metadata to spans, and if your backend supports re-weighting counts based on that, you can still get accurate all-up counts of overall traffic.
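The re-weighting itself is simple arithmetic. A minimal sketch, assuming each kept span carries a `sample_rate` field meaning "this span stands in for N original spans" (the field name and the dict shape are my own placeholders, not a specific backend's schema):

```python
def estimated_count(spans, predicate=lambda span: True):
    """Estimate how many original spans matched, from a sampled set.

    Each kept span counts as `sample_rate` spans: 1 if it was kept
    unconditionally, 100 if it was kept at 1-in-100, and so on.
    """
    return sum(span["sample_rate"] for span in spans if predicate(span))


# Errors kept at 100% (rate 1); successes kept at 1-in-100 (rate 100).
kept = (
    [{"status": 500, "sample_rate": 1} for _ in range(3)]
    + [{"status": 200, "sample_rate": 100} for _ in range(5)]
)

total = estimated_count(kept)                                  # 3*1 + 5*100 = 503
errors = estimated_count(kept, lambda s: s["status"] >= 500)   # 3
```

Only 8 spans were stored, but the totals still approximate the real traffic, and the error count is exact because errors were never dropped.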
Admittedly, OTel and many other backends don't have the best story for this yet. But it's getting better.
While I would like to ingest every one, cost is a factor.
Even if we were self-hosting, there's a cost to ingesting and storing every single span.
And even if we were able to pay for ingesting 100%, not everything is practical to ingest at 100%. Our most common request type (heartbeat) generates a span payload that is a multiple of the size of the original request. We're using Elixir in production, and it can absorb a tremendous amount of traffic, saturating the entire CPU capacity of the hardware if we let it. The agents are not capable of keeping up.