by sateesh 35 days ago
In an effort to better understand deep learning, I built this project (https://github.com/sateeshkumarb/anomaly_detection) to detect anomalies in a batch of log lines. I use a 1D CNN and a Siamese network (trained with triplet loss) to learn anomaly patterns from logs. The goal is to detect anomalies that emerge across multiple lines (e.g., error bursts) rather than just single-line keywords.
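
For readers unfamiliar with triplet loss, here is a minimal sketch of the standard formulation, not the code from the repository. It assumes each batch of log lines has already been mapped to an embedding vector (in the project that would be the 1D CNN's job); the function names and the margin of 1.0 are illustrative choices.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    anchor and positive are embeddings of the same class (e.g. two normal
    windows); negative is an embedding of the other class (e.g. an
    anomalous window). The margin value is an assumption for illustration.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Well-separated triplet: the negative is far away, so the loss is zero.
easy = triplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])

# Badly placed triplet: the positive is farther than the negative.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 0.0])
```

The loss is zero once the negative sits at least `margin` farther from the anchor than the positive does, so training only pushes on triplets that are still confusable.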

To validate the approach, I trained the model on synthetic data that I generated. I did look at the datasets available at https://github.com/logpai/loghub and https://www.unb.ca/cic/datasets/index.html, but couldn't find one that suited my needs.
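
To give a rough idea of what synthetic multi-line anomalies can look like, here is a hypothetical sketch that builds fixed-size windows of log lines and optionally injects a contiguous error burst. The templates, window size, and burst length are all made up for illustration and are not taken from the repository's generator.

```python
import random

# Hypothetical log templates, chosen only for this example.
NORMAL = ["INFO request served", "INFO heartbeat ok", "DEBUG cache hit"]
ERROR = "ERROR connection refused"


def make_window(rng, size=20, burst_len=0):
    """Return (lines, label); label is 1 when an error burst was injected."""
    lines = [rng.choice(NORMAL) for _ in range(size)]
    if burst_len > 0:
        # Overwrite a contiguous run of lines with the error template.
        start = rng.randrange(0, size - burst_len + 1)
        for i in range(start, start + burst_len):
            lines[i] = ERROR
    return lines, int(burst_len > 0)


rng = random.Random(0)  # fixed seed so the example is reproducible
normal_window, normal_label = make_window(rng)
burst_window, burst_label = make_window(rng, burst_len=5)
```

Labeling the whole window rather than individual lines matches the stated goal of catching patterns that only show up across several lines.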

The approach seems to work on the synthetic dataset (ROC AUC score: 0.9957), but I haven't been able to try it on a real-world dataset. I'm seeking feedback on the approach.
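
For context on the metric, ROC AUC is the probability that a randomly chosen anomalous window gets a higher score than a randomly chosen normal one. A minimal pure-Python sketch of that definition follows; the function name and the toy labels and scores are illustrative, and in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used instead.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 0 (normal) or 1 (anomalous); scores are anomaly scores where
    higher means more anomalous. Tied pairs count as half a correct ranking.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: one anomalous window (0.35) scores below one normal window (0.4).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Because the metric only looks at ranking, a high value on synthetic data says the two classes are well separated there, but it does not by itself indicate how the scores would transfer to real logs.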