> Like our hypothetical user who first appeared in 2012 and last posted in 2016 - right now they appear in the 2016 red line but if they showed up again today and you made the graph again next year, they wouldn't be in the 2016 red line anymore

That is correct.

> What happens if you cut off the data at 2022, 2021, 2020, 2019, 2018, etc and plotted those graphs? You'd see a different (rather than merely truncated) graph, no? Maybe even a different trend. So if my understanding is right, this is a pretty wiggly metric. The history of something you want to use as a historical trend line should not change as you append more data.

I see your point, but I don't see how it is avoidable. As far as I know, any user churn metric suffers from the same effect: if you consider a user churned after two weeks of inactivity, the result changes when you move the cut-off (the last two weeks of this month? the two weeks before them? etc.). Even if you measure "elapsed time" instead of "last seen", the cut-off will change your curve.

Extreme example: if you assume a user is churned after 1 year of inactivity (elapsed time since last activity), then a user who shared one story in 2007 and a second story at the end of 2023 will appear as active. If you move the cut-off from 2023 to 2022, the same user will appear as inactive.
You can define a metric such that future data doesn't affect past data. Here's a straightforward one: a user is inactive at time t if they haven't posted in the period between t - k and t, where k is some constant time period one picks. So let's say k is a year and you're looking at active users per year†. In your last example, the user would be counted as active in 2007 and 2008, inactive from 2009 to 2022, and active again in 2023. If you truncate the data at 2022, nothing changes for the years up to 2022.
† A year is probably too big a window for this (I'd take something like a month), but let's stick with it for now.
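
To make that concrete, here is a minimal sketch in Python, assuming year-granular post data. The function name `active_years` and its parameters are made up for illustration. It treats a user as active in year y if they posted in any year p with y - k <= p <= y, which is one reasonable reading of the definition above, not the only one.

```python
def active_years(post_years, first_year, last_year, k=1):
    """Return the years in [first_year, last_year] in which the user is active.

    Assumption (hypothetical reading of the rule above): a user is active in
    year y if they posted in some year p with y - k <= p <= y. Only posts at
    or before y are consulted, so later data cannot change the answer for y.
    """
    posts = set(post_years)
    return [
        y
        for y in range(first_year, last_year + 1)
        if any(y - k <= p <= y for p in posts)
    ]


# The user from the example: one story in 2007, another at the end of 2023.
posts = [2007, 2023]
full = active_years(posts, 2007, 2023)

# Cut the data off at 2022 and recompute from scratch.
cutoff = 2022
truncated = active_years([p for p in posts if p <= cutoff], 2007, cutoff)

print(full)       # [2007, 2008, 2023]
print(truncated)  # [2007, 2008]

# The truncated history is a prefix of the full one: nothing was rewritten.
assert truncated == [y for y in full if y <= cutoff]
```

The window only looks backwards from y, so appending new posts can add later years to the result but never alters earlier ones. Swapping the year for a month would mean working with dates instead of integers, but the same backward-looking comparison applies.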