| Your feelings are spot on. In most modern distributed tracing, "observability", or similar systems the write amplification is typically 100:1 because of these overheads. For example, in Azure, every log entry includes a bunch of highly repetitive fields in full, such as the resource ID, "Azure" as the source system, the log entry Type, the source system, tenant, etc... A single "line" is typically over a kilobyte, but often the interesting part is maybe 4 to 20 bytes of actual payload data. Sending this involves HTTP overheads as well such as the headers, authentication, etc... Most vendors in this space charge by the gigabyte, so as you can imagine they have zero incentive to improve on this. Even for efficient binary logs such as the Windows performance counters, I noticed that second-to-second they're very highly redundant. I once experimented with a metric monitor that could collect 10,000-15,000 metrics per server per second and use only about 100MB of storage per host... per year. The trick was to simply binary-diff the collected metrics with some light "alignment" so that groups of related metrics would be at the same offsets. Almost all numbers become zero, and compress very well. |