
I could go on at length about why MLflow / Databricks' understanding of ML projects is bad to a bonkers degree. I’ll give just one example, which has mattered considerably for several production projects that my team works on and that we tried to manage in MLflow for a while.

The project was a suite of neural network models that provided face & object detection results in a low-latency web interface where customers manipulate photos and want automated metadata about people or objects.

To optimize for performance, we need to frequently experiment with compile-time details of the runtime environment (in our case a container) where the application will run in production.

So the axis of our experiments was not usually anything to do with neural network layers or data or parameters. It was different compiler optimization flags, different precision approximations and GPU settings that needed to be rolled into a huge number of different underlying runtime environments, and then for each distinct runtime environment the more mundane experiments would be carried out for layer topology, number of neurons, width of CNN filters, etc.
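To make the shape of that concrete, here is a rough sketch of the two-level experiment grid, with the runtime environment as the outer axis and the usual model knobs as the inner one. Every name and value here (`RuntimeEnv`, `ModelConfig`, the flag strings) is hypothetical and only for illustration; it is not our actual tooling.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class RuntimeEnv:
    """Compile-time details baked into one container image (hypothetical fields)."""
    opt_flag: str
    precision: str


@dataclass(frozen=True)
class ModelConfig:
    """The more mundane per-model knobs (hypothetical fields)."""
    num_layers: int
    filter_width: int


def experiment_grid(opt_flags, precisions, layer_counts, filter_widths):
    """Yield (runtime env, model config) pairs, runtime env as the outer axis."""
    for opt_flag, precision in product(opt_flags, precisions):
        env = RuntimeEnv(opt_flag, precision)
        # Every distinct runtime environment gets the full set of model experiments.
        for num_layers, filter_width in product(layer_counts, filter_widths):
            yield env, ModelConfig(num_layers, filter_width)


grid = list(experiment_grid(["-O2", "-O3"], ["fp16", "fp32"], [4, 8], [3, 5]))
```

The point is that each `RuntimeEnv` implies a separately built image, so the experiment tracker has to treat "which image was this run in" as a first-class experiment dimension and not as incidental metadata.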

We found that unless you basically build your own entire “meta” version of MLflow that wraps around MLflow, it falls apart at use cases where custom compile-time details of the runtime are themselves aspects of the experiment. Not to mention that the Projects format violates good practices, like 12 Factor stuff, for how to inject settings from the environment, which again leads to wasted effort on special-case deployment handling for MLflow jobs.
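For comparison, the 12 Factor style of injecting settings looks roughly like the sketch below: the job reads everything from environment variables, so the same code runs unchanged under any deployment system. The variable names (`MODEL_PRECISION`, `BATCH_SIZE`, `USE_GPU`) and the defaults are made up for illustration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSettings:
    precision: str
    batch_size: int
    use_gpu: bool


def load_settings(environ=None):
    """Build settings from environment variables (hypothetical names and defaults)."""
    env = os.environ if environ is None else environ
    return JobSettings(
        precision=env.get("MODEL_PRECISION", "fp32"),
        batch_size=int(env.get("BATCH_SIZE", "8")),
        # Treat a few common truthy spellings as "on"; anything else is "off".
        use_gpu=env.get("USE_GPU", "1").lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict as `environ` makes this easy to test, and in production it falls back to `os.environ`, so nothing about the job depends on who launched it.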

Whatever deploys and measures your tasks should not also impose any type of special-case packaging structure, which is a big reason why MLflow conceptually fails. Any DSL-like packaging layer for experiments that makes them diverge from “regular deployment of any old job” is immediately a failed idea. The only thing it’s good for is creating unwitting vendor lock-in once you’re highly dependent on this bespoke, weird packaging template for Projects that makes your ML jobs weirdly (and needlessly) different from other deployment tasks.