
by h1fra 795 days ago

I understand the point, but I also advocate for the opposite. It's not cool for the planet, for sure, but having all the data points for at least a couple of months is very useful on any large system, and 15+ months for metrics so you can compare with the same period the year before.

I can't count the number of times users (or I) discovered a bug after many weeks because something gradually failed over time. It also saves a lot of time to be able to pinpoint the exact day a behavior changed, so you can check that day's deploy and quickly find the bug. Sometimes a trend is not obvious right after a deploy but is clearly visible on the graph over a long period of time.

And for business intelligence, it's always when you badly need a metric that you realize you never tracked it.

1 comment

jcgrillo 795 days ago

Yeah I've definitely been saved a bunch of times by long retention, and the BI questions that might arise are impossible to predict. So some sort of retention is definitely necessary, IME.

But let's take the case of metrics as an example--do we need full sample granularity for "old" data? Do we need full tag cardinality? Sample granularity could be reduced by transforming old data into rollups at a coarser time granularity. Going from 1 Hz to one sample per minute is a 60x reduction. You might lose a bunch of frequency information this way, but maybe that's ok?
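
To make the rollup idea concrete, here is a minimal sketch, assuming raw samples arrive as (unix timestamp in seconds, value) pairs at roughly 1 Hz. The function name `rollup_per_minute` and the choice of count/min/max/mean as the retained aggregates are illustrative assumptions, not something any particular metrics system prescribes.

```python
from collections import defaultdict
from statistics import mean


def rollup_per_minute(samples):
    """Collapse (unix_ts_seconds, value) samples into one summary per minute.

    Hypothetical helper for illustration: keeps count, min, max and mean
    for each minute bucket and discards the individual samples.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Truncate the timestamp to the start of its minute.
        buckets[int(ts) // 60 * 60].append(value)
    return [
        {
            "minute_start": start,
            "count": len(vals),
            "min": min(vals),
            "max": max(vals),
            "mean": mean(vals),
        }
        for start, vals in sorted(buckets.items())
    ]


# Two minutes of synthetic 1 Hz data (made-up values, not real metrics).
raw = [(t, float(t % 60)) for t in range(120)]
rollups = rollup_per_minute(raw)
print(len(raw), "samples ->", len(rollups), "rollup rows")
```

Keeping min and max alongside the mean preserves some sense of spikes within each minute, but anything that varies faster than once per minute is still gone, which is the frequency information loss mentioned above.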

Numbers are really nice in ways that text is not.