by dwhitena
3256 days ago
Great questions and discussion. I'm passionate about versioning both data and code in the context of models and data science. I work full time on the open source Pachyderm project (pachyderm.io), and we have users versioning their data and models in our system. Basically, you can output checkpoints, weights, etc. from your modeling and have that data versioned automatically in Pachyderm. Then, if you use that persisted model in a data pipeline for inference, you get full provenance: which version of which model created which results, and which training data was used to create that version of the model.
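
To make the provenance idea concrete, here is a minimal Python sketch. It is not Pachyderm's API: `ProvenanceStore`, `commit`, and `lineage` are hypothetical names, and the in-memory dictionaries only stand in for a real versioned data store. It assumes each artifact (training data, model weights, inference results) is committed together with the versions of the inputs that produced it.

```python
import hashlib
import json


class ProvenanceStore:
    """Hypothetical in-memory store; illustrative only, not Pachyderm's API."""

    def __init__(self):
        self.blobs = {}    # version ID -> raw bytes of the artifact
        self.parents = {}  # version ID -> {role: parent version ID}

    def commit(self, payload: bytes, **parents: str) -> str:
        # Hash the payload together with its parent versions, so the same
        # bytes produced from different inputs get distinct version IDs.
        record = json.dumps(parents, sort_keys=True).encode() + payload
        vid = hashlib.sha256(record).hexdigest()[:12]
        self.blobs[vid] = payload
        self.parents[vid] = dict(parents)
        return vid

    def lineage(self, vid: str) -> dict:
        # Recursively walk back from a version to everything that produced it.
        return {
            role: {"version": parent, "inputs": self.lineage(parent)}
            for role, parent in self.parents[vid].items()
        }


store = ProvenanceStore()
train_v = store.commit(b"x,y\n1,2\n3,4\n")                              # training data
model_v = store.commit(b"weights: [0.5, 1.0]", training_data=train_v)  # trained model
result_v = store.commit(b"prediction: 7.0", model=model_v)             # inference output

# Trace a result back to the model version and the training data behind it.
print(json.dumps(store.lineage(result_v), indent=2))
```

Calling `store.lineage(result_v)` walks from the inference result to the model version and then to the training data version, which is the kind of question provenance tracking is meant to answer.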