by skorgu
1704 days ago
I'm curious how this can both avoid the average-of-averages problem (presumably by using the original full-rate data to compute multiple aggregates) and also support backfilling. Is there a danger of the full-rate data expiring, so that backfills past that horizon behave differently? Or am I wholly misunderstanding both of these features?

Great question. We handle averages of averages correctly by storing the intermediate state of the aggregate (for an average, that's the sum and the count), so we can cleanly re-aggregate.
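To make the sum-and-count idea concrete, here is a minimal Python sketch. It is only an illustration, not the actual implementation: `AvgState` and `state_of` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AvgState:
    """Hypothetical partial state for an average: a running sum and count."""
    total: float = 0.0
    count: int = 0

    def add(self, value):
        # Fold one raw value into the state.
        return AvgState(self.total + value, self.count + 1)

    def merge(self, other):
        # Combine two partial states without touching the raw data.
        return AvgState(self.total + other.total, self.count + other.count)

    def result(self):
        # Finalize the state; an empty state has no average.
        return self.total / self.count if self.count else None


def state_of(values):
    state = AvgState()
    for v in values:
        state = state.add(v)
    return state


# Two hourly buckets with different numbers of samples.
hour_1 = state_of([10.0, 20.0, 30.0])  # 3 samples
hour_2 = state_of([100.0])             # 1 sample

naive = (hour_1.result() + hour_2.result()) / 2  # average of averages
correct = hour_1.merge(hour_2).result()          # merged sum / merged count
```

Here `naive` weights both hours equally even though one has three times as many samples, while `correct` matches what you would get from averaging the four raw values directly.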
Eventually, we'll be able to incrementally update the aggregate on backfill even if the raw data is no longer available. That's not implemented yet, though, so for now a backfill only updates the aggregate if the raw data is still around: it recomputes the intermediate state of the affected buckets from the raw data. In most cases that isn't actually an issue, since most people have a data retention period longer than their backfill horizon.
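As a rough sketch of the current backfill behavior described above (again hypothetical, not the real code): late points are written to the raw data, and only the buckets they touch get their (sum, count) state recomputed from raw. The names `backfill`, `bucket_of`, `BUCKET_SECONDS`, and `retention_start` are assumptions for this example, and the retention boundary is assumed to be bucket-aligned.

```python
BUCKET_SECONDS = 3600  # hypothetical bucket width: one hour


def bucket_of(ts):
    # Start timestamp of the bucket containing ts.
    return ts - ts % BUCKET_SECONDS


def backfill(raw, aggregates, new_points, retention_start):
    """Insert late points and recompute state for the buckets they touch.

    raw:             dict of bucket_start -> list of retained raw values
    aggregates:      dict of bucket_start -> (sum, count)
    new_points:      iterable of (timestamp, value)
    retention_start: oldest timestamp whose raw data is still kept
                     (assumed bucket-aligned)
    Returns the points that were skipped because raw data has expired.
    """
    skipped = []
    affected = set()
    for ts, value in new_points:
        if ts < retention_start:
            # Raw data for this bucket is gone, so it cannot be recomputed.
            skipped.append((ts, value))
            continue
        b = bucket_of(ts)
        raw.setdefault(b, []).append(value)
        affected.add(b)
    for b in affected:
        values = raw[b]
        aggregates[b] = (sum(values), len(values))
    return skipped


raw = {3600: [1.0, 3.0]}                       # bucket 0 has already expired
aggregates = {0: (10.0, 2), 3600: (4.0, 2)}
skipped = backfill(raw, aggregates, [(3700, 5.0), (100, 7.0)], retention_start=3600)
```

In this sketch the point at `3700` updates its bucket, while the point at `100` falls before the retention boundary and leaves the old aggregate untouched, which is the horizon behavior the question asks about.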