by huac 3813 days ago
Anomaly detection varies a lot across domains. For CC fraud / risk, you have discrete transactions, so the problem is one of classification and is generally approached with supervised learning.

I don't know what you mean by combining multiple variables. Do you mean analysis methods that work with multiple variables (instead of 1-dimensional z-scores), or do you mean methods that combine multiple variables into one to reduce input dimensions (e.g. principal component analysis)?
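To make the first reading concrete, here is a minimal sketch of the difference between per-variable z-scores and a score that looks at two variables jointly. It uses Mahalanobis distance as one possible multivariate score; that choice, the function names, the toy data, and the cutoff are all assumptions made for illustration, not something from the article under discussion.

```python
import statistics

def z_scores(values):
    """Per-variable z-scores: distance from the mean in standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def mahalanobis_2d(xs, ys):
    """Distance of each (x, y) point from the joint mean, scaled by the 2x2 covariance."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    det = sxx * syy - sxy ** 2  # determinant of the covariance matrix
    distances = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse covariance matrix, written out by hand.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(d2 ** 0.5)
    return distances

# Toy data, invented for illustration: y tracks x closely except for the last point.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 3]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9, 7.2, 7.8, 9.1, 9.9, 8.0]

THRESHOLD = 2.5  # arbitrary cutoff chosen for this example

univariate_flags = [abs(zx) > THRESHOLD or abs(zy) > THRESHOLD
                    for zx, zy in zip(z_scores(xs), z_scores(ys))]
multivariate_flags = [d > THRESHOLD for d in mahalanobis_2d(xs, ys)]

print(univariate_flags)
print(multivariate_flags)
```

On this toy data no point exceeds the cutoff on either variable alone, while the joint distance singles out the last point, (3, 8.0), because it breaks the correlation between the two variables even though each coordinate looks ordinary by itself.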

Because data and data reporting platforms are so different across companies, there's no 'simple framework' to do reporting. You probably want something like https://github.com/etsy/skyline.

You also describe an ensemble method for outlier detection, which is what Skyline uses. I want to note that there is no reason to consider ensembles "not" outlier detection.
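For a sense of what an ensemble looks like in practice, here is a rough sketch in which several simple detectors each vote on the latest point of a series and the point is flagged only when enough of them agree. This is not Skyline's code; the three detectors, their thresholds, and the `consensus` parameter are hypothetical choices made for the example.

```python
import statistics

def three_sigma(series):
    """Vote yes if the latest value is more than 3 standard deviations from the mean of the history."""
    *history, latest = series
    mean = statistics.fmean(history)
    sd = statistics.pstdev(history)
    return sd > 0 and abs(latest - mean) > 3 * sd

def median_deviation(series):
    """Vote yes if the latest value is more than 6 median absolute deviations from the median."""
    *history, latest = series
    med = statistics.median(history)
    mad = statistics.median(abs(v - med) for v in history)
    return mad > 0 and abs(latest - med) > 6 * mad

def range_check(series):
    """Vote yes if the latest value falls outside everything seen so far."""
    *history, latest = series
    return latest > max(history) or latest < min(history)

# Hypothetical detector set; a real system would likely use more and better ones.
DETECTORS = [three_sigma, median_deviation, range_check]

def is_anomalous(series, consensus=2):
    """Flag the latest point only if at least `consensus` detectors vote yes."""
    votes = sum(1 for detector in DETECTORS if detector(series))
    return votes >= consensus

# Made-up series for illustration; only the last value differs.
history = [10, 11, 9, 10, 12, 10, 11]
print(is_anomalous(history + [10]))  # ordinary value
print(is_anomalous(history + [13]))  # slightly above the previous maximum
print(is_anomalous(history + [40]))  # large spike
```

Requiring agreement trades sensitivity for fewer false alarms: in this sketch the value 13 trips only the range check, so it is not flagged at `consensus=2` but would be at `consensus=1`.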

1 comments

I meant using multiple variables to categorize outlier events. What was shown here are also techniques to categorize discrete events ("Does this day cross some threshold?"). I guessed supervised learning methods.