by gidim 2760 days ago
Sure, you can reuse tools to achieve similar results. As with everything else, the devil is in the details. Does your monitoring system save results forever, or does it only let you report 90 days back? Can you compare two runs in a meaningful way, i.e. not just logs but also interactively plotting and exploring your results? Do you need to spend hours instrumenting your code? Can you sort Jenkins jobs by a parameter or metric? What about reporting new results to an existing experiment? There are many more examples. But in any case, if you can reuse your CI/CD system for ML experiment management, you should do that. Another question worth considering: if this is a "failed idea", why would engineering-led tech companies build these systems? Obviously they tried reusing their current tooling.

The tools we've been building for the past fifty years were designed for software engineering. Machine learning workflows are different in many ways and as such require new tools and approaches. That's at least our perspective.

1 comment

Literally all the example cases you mention are also needed when comparing results for database tuning, load balancing, A/B testing, etc. None of those asks would differentiate ML projects from any other type of general project. So unless you plan to shoehorn non-ML projects into an upstart system purportedly for ML projects, you’re just wasting resources (usually egregiously) by using a different tool. Even just thinking ML problems are somehow different is usually already a sign that you’re investing in ML in a way that is very unlikely to map to project success.