Hacker News new | ask | show | jobs
by bfirsh 2037 days ago
We talked to a bunch of MLflow users, and the general impression we got is that it is heavyweight and hard to set up. MLflow is an all-encompassing "ML platform". Which is fine if you need that, but we're trying to just do one thing well. (Imagine if Git called itself a "software platform".)

In terms of features, Replicate points directly at an S3 bucket (so you don't have to run a server and Postgres DB), it saves your training code (for reproducibility and to commit to Git after the fact), and it has a nice API for reading and analyzing your experiments in a notebook.

1 comments

Congrats on the launch!

>MLflow is an all-encompassing "ML platform"

Not really. We're trying to use MLflow with our "ML platform"[0]. Namely, it can save a model that expects high dimensional inputs, which is most models I've seen that aren't trivial, and can "deploy" the model but with an expectation of two dimensional DataFrame inputs. Apparently, they're working on that.

There are also many ambiguities concerning Keras and Tensorflow stemming from "What is a Keras model? Is it a Tensorflow model now they're integrated? Why are Keras models logged with the tensorflow model logger when you use the autolog functionality?". These are shared ambiguities, as there are several ways to save and load models with Tensorflow, and we're looking into the Keras/Tensorflow integration closely. MLflow uses `cloudpickle` and unpickling expects not only the same 'protocol', but the same Python version. Had to dig deeper than necessary.

One other problem is when a model relies on ancillary functions, which you must be able to ship somehow. You end up tinkering with its guts, too.

Could you shed some light on how do you deal with these matters. Namely, high dimensional inputs for models, pre-processing/post-processing functions, serialization brittleness, and Keras/Tensorflow "duality".

We have to inherit that complexity to spare our users from having to mentally think of saving their experiments (we do that automatically to save models, metrics, params). The workflow is data --> collaborative notebooks with scheduling features and job --> (generate appbooks) --> automatically tracked models/params/metrics --> one click deployment --> 'REST' API or form to invoke model.

Aaaaaand again, congrats on the launch!

- [0]: https://iko.ai