by Wookai
1703 days ago
Google employee here. xmanager is one of the main ML experiment tracking/orchestration tools we use internally, and I'm pretty excited that it is now available for others to use!

In a nutshell, xmanager allows you to:
- define an experiment, which is a collection of one or more work units (think combinations of hyperparameters)
- manage the different jobs/executables required to run this experiment (TPU workers, TensorBoard job, etc.)
- collect and display measurements from work units (loss, other metrics)
- keep a reproducible artifact which allows you to re-run the same experiment at any point in the future

See e.g. https://github.com/deepmind/xmanager/blob/main/examples/ for a few concrete examples of launcher scripts. I wish they had included screenshots of the tool itself in the repo; I'll make that suggestion :).
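
To make the "experiment = collection of work units" idea a bit more concrete, here is a toy sketch in plain Python. It is not the xmanager API: the `Experiment` and `WorkUnit` classes, the `add_sweep` method, and the hyperparameter names are all made up for illustration, and the logged loss is a placeholder rather than a real measurement. The real launcher scripts are in the examples directory linked above.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    """Hypothetical: one hyperparameter combination plus its measurements."""
    work_unit_id: int
    hyperparameters: dict
    measurements: list = field(default_factory=list)

    def log(self, step, **metrics):
        # Record one measurement point (e.g. the loss at a given step).
        self.measurements.append({"step": step, **metrics})


@dataclass
class Experiment:
    """Hypothetical: a titled collection of work units."""
    title: str
    work_units: list = field(default_factory=list)

    def add_sweep(self, **grid):
        # One work unit per point in the Cartesian product of the grid.
        names = list(grid)
        for values in itertools.product(*(grid[name] for name in names)):
            unit = WorkUnit(len(self.work_units) + 1, dict(zip(names, values)))
            self.work_units.append(unit)
        return self.work_units


experiment = Experiment("toy_sweep")
experiment.add_sweep(learning_rate=[0.1, 0.01], batch_size=[32, 64])
for unit in experiment.work_units:
    unit.log(step=0, loss=1.0)  # placeholder value, not a real result
```

Two learning rates times two batch sizes gives four work units here, each carrying its own hyperparameters and its own list of measurements, which is roughly the mental model described in the list above.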