| HN Mirror



	by usasha 2688 days ago
	I just ran through the official tutorial and I'm pretty impressed. As I understand it, the idea is to:
	- use a Git branch for each experiment (a change of hyperparameters, etc.)
	- define pipeline stages (preprocessing, train/test split, model training, model validation)
	- after these steps, change any part of the pipeline (say, data preprocessing or model parameters) and run `dvc repro` to reproduce every stage whose dependencies changed, and track metrics across all branches
	This is pretty cool and reduces the need for experiment logs in a wiki.
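
To make the "reproduce only the stages whose dependencies changed" idea concrete, here is a minimal toy sketch in Python. It is not DVC's implementation or API: the `Pipeline` class, its `stage` and `repro` methods, and the `fingerprint` helper are hypothetical names invented for illustration, and it fingerprints in-memory values rather than files on disk.

```python
import hashlib


def fingerprint(values):
    """Combine the string forms of several values into one SHA-256 digest."""
    h = hashlib.sha256()
    for v in values:
        h.update(hashlib.sha256(v.encode()).digest())
    return h.hexdigest()


class Pipeline:
    """Toy pipeline (hypothetical, not DVC): stages run in the order added."""

    def __init__(self):
        self.stages = []  # list of (name, deps, out, func)
        self.cache = {}   # stage name -> fingerprint of its inputs at last run

    def stage(self, name, deps, out, func):
        self.stages.append((name, deps, out, func))

    def repro(self, data):
        """Re-run stages whose inputs changed; return the names that ran."""
        ran = []
        for name, deps, out, func in self.stages:
            fp = fingerprint([repr(data[d]) for d in deps])
            if self.cache.get(name) != fp or out not in data:
                data[out] = func(*(data[d] for d in deps))
                self.cache[name] = fp
                ran.append(name)
        return ran


# "raw" stands in for a dataset and "lr" for a hyperparameter.
data = {"raw": [3, 1, 2], "lr": 0.1}
p = Pipeline()
p.stage("preprocess", ["raw"], "clean", lambda raw: sorted(raw))
p.stage("train", ["clean", "lr"], "model", lambda clean, lr: sum(clean) * lr)
p.stage("validate", ["model"], "metric", lambda model: round(model, 3))

first = p.repro(data)   # nothing cached yet, so all three stages run
second = p.repro(data)  # no inputs changed, so nothing runs
data["lr"] = 0.2        # change one hyperparameter
third = p.repro(data)   # only "train" and the downstream "validate" run
print(first, second, third)
```

Changing `lr` leaves the preprocessing stage untouched, which is the property the comment above describes: only the affected part of the pipeline is recomputed.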

1 comment

dmpetrov 2688 days ago

Exactly. And it is as flexible as Git: you can define your own workflow. For example, some data scientists avoid using Git branches for experiments and use directories instead.