Hacker News
by modin 1982 days ago
I've implemented something very similar at work; this was a nice write-up. The biggest difference is that we use Welford's algorithm[0] to calculate a running variance, so we can detect anomalies in real time without needing to store logs. It works quite well.

[0]: https://en.m.wikipedia.org/wiki/Algorithms_for_calculating_v...
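
For anyone who hasn't seen it, here is a minimal sketch of what a Welford-style running variance could look like in Python. The `RunningStats` class, the z-score rule, and the threshold of 3 are assumptions for illustration, not the implementation described above; the point is only that the mean and variance are updated one value at a time, so no history has to be kept.

```python
import math


class RunningStats:
    """Running mean and variance via Welford's online algorithm."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old mean and from the updated mean.
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Sample variance; 0.0 until at least two values have been seen."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

    def is_anomaly(self, x, threshold=3.0):
        """Hypothetical rule: flag x if it is more than `threshold`
        standard deviations away from the running mean."""
        std = math.sqrt(self.variance())
        if self.n < 2 or std == 0.0:
            return False
        return abs(x - self.mean) / std > threshold


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(value)
```

In a real-time setting you would presumably call `is_anomaly(x)` before `update(x)`, so that a spike is judged against the history it arrived into rather than against statistics it has already shifted.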

1 comment

Same. The difference in my implementation was that I added an 8-week moving average. This worked for most of the year, and I think even for holidays: what was considered normal for the week of Thanksgiving would not be considered normal for a summer week. This was relatively easy to do using SQL CTEs and window functions.
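
A rough sketch of the moving-average idea, written in Python rather than SQL to keep it self-contained. The function name, the weekly granularity of the input, and the 3-standard-deviation threshold are assumptions for illustration, not my original queries: each week is compared against the mean and spread of the 8 weeks before it.

```python
from statistics import mean, stdev


def flag_anomalies(weekly_totals, window=8, threshold=3.0):
    """Return the indices of weeks that deviate from the trailing
    `window`-week moving average by more than `threshold` standard
    deviations. The first `window` weeks have no full baseline and
    are never flagged."""
    flagged = []
    for i in range(window, len(weekly_totals)):
        history = weekly_totals[i - window:i]   # the preceding `window` weeks
        baseline = mean(history)                # trailing moving average
        spread = stdev(history)                 # sample standard deviation
        if spread > 0 and abs(weekly_totals[i] - baseline) > threshold * spread:
            flagged.append(i)
    return flagged


weekly = [100, 102, 98, 101, 99, 103, 97, 100, 101, 250]
anomalous_weeks = flag_anomalies(weekly)
```

Because the baseline only looks at recent weeks, it drifts with the season instead of comparing every week against a single yearly average. One caveat of this simple version is that a flagged week still enters the baseline for the weeks that follow it.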

Separately, I also used an R package called bsts (Bayesian Structural Time Series) as a different way of projecting seasonality onto a trend to find an acceptable normal range. If the actual value fell outside that range, it was treated as an anomaly. There is a great write-up on the technique by Kim Larsen here: https://multithreaded.stitchfix.com/blog/2016/04/21/forget-a...