Show HN: Graphsignal – Production Model Monitoring: Anomalies, Drift, Exceptions (github.com)
30 points by dmitrim 1712 days ago
2 comments

It would be possible to build a similar system via a library my team has built: https://github.com/capitalone/dataprofiler

Effectively, you can monitor changes between profiles:

import json
import dataprofiler as dp

data1 = dp.Data("file_a.csv")  # Load a CSV file
profile1 = dp.Profiler(data1)  # Generate a profile

data2 = dp.Data("file_b.csv")  # Load another CSV file
profile2 = dp.Profiler(data2)  # Generate another profile

# Compare the two profiles and print the differences as JSON
diff_report = profile1.diff(profile2)
print(json.dumps(diff_report, indent=4))

The system we have generates reports; it might be worth adding to yours, OP.

What does this have to do with model monitoring?
You can pass the output of the model to the profiling system to monitor if things are drifting.

It's also possible to monitor the input data and link any drift there back to the model's outputs.
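To make the drift idea concrete, here is a minimal sketch that compares a baseline batch of numeric model outputs (or a single input column) against a newer batch using the Population Stability Index. This is plain Python, not DataProfiler's API; the function name `psi`, the bin count, and the empty-bin floor are all assumptions chosen for illustration.

```python
import math


def psi(baseline, current, bins=10):
    """Population Stability Index between two samples of numeric values."""
    lo, hi = min(baseline), max(baseline)
    # Fall back to a width of 1.0 if every baseline value is identical.
    width = (hi - lo) / bins or 1.0

    def fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = fractions(baseline), fractions(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))


# Hypothetical data: scores from last month vs. scores shifted upward.
baseline_scores = [i / 100 for i in range(100)]
shifted_scores = [s + 0.5 for s in baseline_scores]

print(psi(baseline_scores, baseline_scores))
print(psi(baseline_scores, shifted_scores))
```

Identical batches score zero, and the score grows as the distributions diverge. A PSI above roughly 0.2 is often treated as a sign of meaningful drift, though the right threshold depends on the data.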

There are quite a few ways to do this, but effectively you can monitor drift by identifying which inputs have the greatest impact on accuracy, then tying that back to predict the drift over time.
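One possible way to find which inputs have the greatest impact on accuracy is permutation importance: shuffle one input column at a time and measure how much accuracy drops. The sketch below is an assumption about how this could be done, not the approach DataProfiler or the commenter necessarily uses; `permutation_impact`, the toy model, and the data are all hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_impact(model, rows, labels, seed=0):
    """Accuracy drop per input column when that column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    impact = {}
    for col in range(len(rows[0])):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only this column replaced.
        permuted = [r[:col] + (v,) + r[col + 1:] for r, v in zip(rows, shuffled)]
        impact[col] = base - accuracy(model, permuted, labels)
    return impact


# Hypothetical setup: the toy model only looks at column 0.
toy_model = lambda row: row[0] > 0.5
rows = [(i / 100, (i * 37 % 100) / 100) for i in range(100)]
labels = [toy_model(r) for r in rows]

impact = permutation_impact(toy_model, rows, labels)
print(impact)
```

Columns with the largest accuracy drop are the ones worth watching most closely for drift, since a shift in their distribution is most likely to hurt the model.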

Wow! A great idea (haven't looked into the code yet). With the new EU AI regulations coming in 2023/4, every company with ML in production will need to be able to monitor these issues. Potential for a very good open core business model.