Hacker News
by usasha 2688 days ago
Just ran through the official tutorial and I'm pretty impressed.

As I understand it, the idea is to:
- use a git branch for each experiment (a change of hyperparameters, etc.)
- define pipeline stages (preprocessing, train/test split, model training, model validation)
- after these steps, change any part of the pipeline (say, data preprocessing or model parameters) and run `dvc repro` to reproduce all stages whose dependencies changed

You can then track metrics across all branches, which is pretty cool and cuts down on experiment logs kept in a wiki.
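To make the "rerun only the stages whose dependencies changed" idea concrete, here is a toy Python sketch. It is not how DVC is implemented and it does not use any DVC API; the names `fingerprint`, `repro`, `stages`, and `cache` are made up for illustration, and the stage functions are placeholders for real preprocessing and training code.

```python
import hashlib


def fingerprint(deps):
    """Hash a stage's dependency values into one digest (toy stand-in for file checksums)."""
    h = hashlib.sha256()
    for name in sorted(deps):
        h.update(f"{name}={deps[name]!r};".encode())
    return h.hexdigest()


def repro(stages, inputs, cache):
    """Run stages in order, skipping any stage whose dependencies are unchanged.

    stages: list of (name, dependency_names, func), already in dependency order
    inputs: external values such as raw data and hyperparameters
    cache:  dict mapping stage name -> (fingerprint, output), updated in place
    Returns the names of the stages that were actually executed.
    """
    values = dict(inputs)
    ran = []
    for name, dep_names, func in stages:
        deps = {d: values[d] for d in dep_names}
        fp = fingerprint(deps)
        if name in cache and cache[name][0] == fp:
            # Dependencies unchanged: reuse the cached output.
            values[name] = cache[name][1]
        else:
            values[name] = func(deps)
            cache[name] = (fp, values[name])
            ran.append(name)
    return ran


# Hypothetical four-stage pipeline mirroring the stages listed above.
stages = [
    ("preprocess", ["raw"], lambda d: [x * 2 for x in d["raw"]]),
    ("split", ["preprocess"], lambda d: (d["preprocess"][:2], d["preprocess"][2:])),
    ("train", ["split", "lr"], lambda d: sum(d["split"][0]) * d["lr"]),
    ("evaluate", ["train", "split"], lambda d: d["train"] - sum(d["split"][1])),
]

cache = {}
first = repro(stages, {"raw": [1, 2, 3, 4], "lr": 0.1}, cache)
second = repro(stages, {"raw": [1, 2, 3, 4], "lr": 0.1}, cache)
third = repro(stages, {"raw": [1, 2, 3, 4], "lr": 0.5}, cache)  # only a hyperparameter changes

print(first)
print(second)
print(third)
```

In this sketch the first call runs every stage, the second runs nothing, and the third reruns only `train` and `evaluate`, because changing `lr` leaves the preprocessing and split outputs untouched.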

1 comment

Exactly. And it is as flexible as Git: you can define your own workflow. For example, some data scientists avoid using Git branches for experiments and use directories instead.