| HN Mirror

by citilife 1758 days ago

It would be possible to build a similar system via a library my team has built: https://github.com/capitalone/dataprofiler

Effectively, you can monitor changes between profiles:

    import json
    import dataprofiler as dp

    data1 = dp.Data("file_a.csv")  # Load a CSV file
    profile1 = dp.Profiler(data1)  # Generate a profile

    data2 = dp.Data("file_b.csv")  # Load another CSV file
    profile2 = dp.Profiler(data2)  # Generate another profile

    diff_report = profile1.diff(profile2)  # Compare the two profiles
    print(json.dumps(diff_report, indent=4))

Our system generates reports; it might be worth adding, OP.

1 comments

manojlds 1758 days ago

What does this have to do with model monitoring?

citilife 1758 days ago

You can pass the output of the model to the profiling system to monitor if things are drifting.

It's also possible to monitor the input data and link back.
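As a rough sketch of the idea, without relying on any particular library: summarize each batch of model outputs with a small profile, then diff the two. The helpers `profile`, `diff_profiles`, and `has_drifted` are hypothetical names for illustration, not part of DataProfiler's API, and the 0.5 threshold is an arbitrary assumption.

```python
import statistics

def profile(values):
    """Summarize a batch of numeric model outputs."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.pstdev(values),
    }

def diff_profiles(p1, p2):
    """Return the per-statistic change from p1 to p2."""
    return {key: p2[key] - p1[key] for key in p1}

def has_drifted(p1, p2, threshold=0.5):
    """Flag drift when the mean moves by more than `threshold` baseline stdevs."""
    scale = p1["stdev"] or 1.0  # avoid dividing by zero for constant batches
    return abs(p2["mean"] - p1["mean"]) / scale > threshold

# Made-up prediction scores from two points in time.
baseline_scores = [0.10, 0.20, 0.15, 0.25, 0.20]
current_scores = [0.40, 0.55, 0.50, 0.45, 0.60]

baseline = profile(baseline_scores)
current = profile(current_scores)

print(diff_profiles(baseline, current))
print(has_drifted(baseline, current))
```

The same comparison works on input columns, which is what lets you link a shift in the outputs back to a shift in a specific input.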

There are quite a few ways to do this, but effectively you can monitor drift by identifying which inputs have the greatest impact on accuracy, then tying that back to predict the drift over time.
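One way to sketch that, under heavy assumptions: estimate each input's impact on accuracy by permuting its column and measuring the accuracy lost, then weight each input's observed shift by that impact. Everything here is illustrative: the toy `model`, the data, and the `importance * shift` priority score are made up, and the column is rotated by one position instead of randomly shuffled so the result is deterministic.

```python
def model(row):
    # Toy stand-in for a trained model: predicts 1 when "x0" exceeds 0.5.
    return 1 if row["x0"] > 0.5 else 0

def accuracy(rows, labels):
    hits = sum(model(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def importance(rows, labels, feature):
    """Accuracy lost when one feature's column is permuted (rotated by one)."""
    column = [r[feature] for r in rows]
    rotated = column[1:] + column[:1]
    permuted = [{**r, feature: v} for r, v in zip(rows, rotated)]
    return accuracy(rows, labels) - accuracy(permuted, labels)

def mean_shift(old_rows, new_rows, feature):
    """Absolute change in a feature's mean between two batches."""
    old = sum(r[feature] for r in old_rows) / len(old_rows)
    new = sum(r[feature] for r in new_rows) / len(new_rows)
    return abs(new - old)

# Reference data: the label depends only on x0; x1 is irrelevant to the model.
train = [{"x0": i % 2, "x1": i * 0.1} for i in range(6)]
labels = [r["x0"] for r in train]

# Later batch: x1 has shifted a lot, x0 only a little.
live = [{"x0": r["x0"] * 0.5 + 0.5, "x1": r["x1"] + 2.0} for r in train]

priority = {
    f: importance(train, labels, f) * mean_shift(train, live, f)
    for f in ("x0", "x1")
}
ranked = sorted(priority, key=priority.get, reverse=True)
print(ranked)
```

Here `x1` drifts more in absolute terms, but `x0` ranks first because the model's accuracy actually depends on it.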
