by jamesblonde
2002 days ago
We address the problem of adding provenance without rewriting your TensorFlow/scikit-learn/PyTorch/PySpark application by adding CDC (change data capture) support in the ML stack and collecting all events in a metadata layer, which builds an implicit provenance graph. It's now part of the open-source Hopsworks platform. See this USENIX OpML'20 talk on it: https://www.youtube.com/watch?v=PAzEyeWItH4
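To make the idea concrete, here's a toy sketch of how a provenance graph can fall out of captured events without touching application code. This is not Hopsworks code: the `FsEvent` record, the `build_provenance` helper, the job ids and the paths are all made up for illustration, and it assumes each captured event tells you which job touched which path and whether it read or wrote it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FsEvent:
    """Hypothetical change event; the field names are assumptions."""
    app_id: str  # id of the job/run that touched the file
    op: str      # "read" or "write"
    path: str


def build_provenance(events):
    """Link every path a job wrote to every path the same job read."""
    reads = defaultdict(set)
    writes = defaultdict(set)
    for e in events:
        if e.op == "read":
            reads[e.app_id].add(e.path)
        elif e.op == "write":
            writes[e.app_id].add(e.path)

    graph = defaultdict(set)  # output path -> set of input paths
    for app_id, outputs in writes.items():
        for out in outputs:
            # A job's inputs are everything it read, minus the output itself.
            graph[out] |= reads[app_id] - {out}
    return dict(graph)


# Made-up event stream: a feature engineering job, then a training job.
events = [
    FsEvent("fe_job_0", "read", "/raw/clicks.csv"),
    FsEvent("fe_job_0", "write", "/featurestore/td_v1"),
    FsEvent("train_job_1", "read", "/featurestore/td_v1"),
    FsEvent("train_job_1", "write", "/models/m_v1"),
]
lineage = build_provenance(events)
```

The point is that the training script never calls a provenance API; the model-to-training-data-to-raw-data links are inferred purely from the events the platform sees.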
edit: I should add that I'm definitely in favour of having provenance in ML systems, and libraries layered on top are how people currently get it. It's just odd that people aren't working on adding that support directly into scikit-learn/TensorFlow/PyTorch etc.