| HN Mirror



	by gidim 2769 days ago
	Runs and experiments are not mapped 1:1. A single container run can generate multiple experiments, as in the case of a parameter search. Additionally, traditional version control tools are not well suited to ML results and exploration. That said, code is still a big piece of the puzzle. Our approach at Comet.ml is to snapshot everything, whether it runs in a container or not, and tie that back to git.

1 comment

mlthoughts2018 2769 days ago

This is still perfectly synonymous with regular build tools, like running a rebuild in Jenkins or ‘build with parameters.’ The point is to treat builds and runs of an experiment setup exactly the same, with the same tooling, monitoring, data capturing, etc., as any other deployed program. There is nothing special about a one-off job that trains a model or computes an experimental result compared with jobs that perform an experiment on database tuning or test load on a web service or any other type of deployed job. You have monitoring and probing of key stats and health of the experiment, you can reproduce the exact run or the same run with modified parameters, and the run produces output artifacts or writes data. It’s all perfectly the same.

Basically if someone shows me a supposed ML experiment tracking system, the first question is, “If I replace the phrase ‘ML experiment’ with ‘generic computing task’, does the tool still handle everything exactly the same?”

If not, it’s a failed idea, because you’re trying to break model training or tuning jobs out of the regular deployment model and you’re not using consistent tooling to manage deployment of experiment runs and all other types of “jobs” that you can “run.”


gidim 2769 days ago

Sure, you can reuse tools to achieve similar results. As with everything else, the devil is in the details. Does your monitoring system save results forever, or does it only let you report 90 days back? Can you compare two runs in a meaningful way, i.e. not just logs but also interactively plotting and exploring your results? Do you need to spend hours instrumenting your code? Can you sort Jenkins jobs by a parameter or metric? What about reporting new results to an existing experiment? There are many more examples. But in any case, if you can reuse your CI/CD system for ML experiment management, you should do that. Another question worth considering: if this is a "failed idea", why would engineering-led tech companies build these systems? Obviously they tried reusing their current tooling.

The tools we've been building for the past fifty years were designed for software engineering. Machine learning workflows are different in many ways and as such require new tools and approaches. That's our perspective, at least.


mlthoughts2018 2768 days ago

Literally all the example cases you mention are also needed when comparing results for database tuning, load balancing, A/B testing, etc. None of those asks would differentiate ML projects from any other type of general project. So unless you plan to shoehorn non-ML projects into an upstart system purportedly for ML projects, you’re just wasting resources (usually egregiously) by using a different tool. Even just thinking that ML problems are somehow different is usually already a sign that you’re investing in ML in a way that is very unlikely to map to project success.
