by maldeh
1885 days ago
Yeah, given that the article started off by establishing that an Estimator is basically an interface with simple rules about supporting "fit" and "predict", and that it can contain anything or do anything, I thought the argument laid out here would be about how these derivative implementations broke those rules. Instead, the rest of the article seems to have lost the plot, finding fault with various derivative or concrete implementations of this interface for A) being inextensible implementations and not transitive interfaces themselves, as though "be anything, do anything" no longer applied; or B) not being perfectly aligned with sklearn estimator details that the author never identified as essential, like not following some sklearn-specific parameter naming rule or not being serializable via pickle. (Seriously, pickle support is often not appropriate for production, so why should it be a required pattern? It's not even a requirement of the interface unless you read between the lines, which the author implies is necessary for parity.) As other commenters have outlined here, the article assumes that sklearn's contract is absolute, as though other libraries couldn't reinterpret the core principles. The arguments against TensorFlow's or SageMaker's interfaces especially stretch quite a bit: what exactly is so offensive about these implementations, given the very rules that the author establishes in this article? All "fit" is supposed to do is update internal state, as the author asserts, so what precludes implementations of this interface from using cloud-based compute resources to achieve this end? And why would the fact that this command deploys a Docker container to the cloud make "fit" a lie? And honestly, what does the author have in mind for an estimator implementation that uses cloud resources like GCP TPUs or AWS EC2 and is also somehow more correct or pure than these implementations?
More than anything, the author's dismissal of the value that GCP's and AWS's Estimator implementations bring in eliminating infrastructure management (equating it to "simply" writing Dockerfiles or running Docker containers on the cloud, like there's no setup involved) implies that they're thoroughly disconnected from the realities of ML devops on the cloud. They're free to run their purist single-core sklearn estimators on their laptops as much as they'd like, though (unless Dask somehow gets a pass from these arbitrary rules around how estimators can and cannot be used).
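For readers who want the "fit"/"predict" point in concrete terms, here is a minimal sketch of the idea that the interface only constrains the method names and the fact that "fit" updates internal state, not where the computation happens. Everything in it is an assumption made for illustration: the `Estimator` protocol, both classes, and `fake_remote_mean` (a plain function standing in for a cloud training job) are hypothetical names, not the sklearn, TensorFlow, or SageMaker APIs.

```python
from typing import Callable, List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class Estimator(Protocol):
    """Hypothetical minimal contract: anything with fit and predict."""

    def fit(self, X: Sequence[float], y: Sequence[float]) -> "Estimator": ...

    def predict(self, X: Sequence[float]) -> List[float]: ...


class LocalMeanEstimator:
    """Learns the mean of y in-process and predicts it for every input."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)  # internal state updated by fit
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class DelegatingMeanEstimator:
    """Same contract, but fit hands the work to a caller-supplied backend.

    `backend` is a stand-in for remote compute (an assumption for this
    sketch); here it is just a local function.
    """

    def __init__(self, backend: Callable[[Sequence[float]], float]):
        self.backend = backend

    def fit(self, X, y):
        self.mean_ = self.backend(y)  # state still ends up on the object
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fake_remote_mean(y):
    """Pretend remote job: computes the mean 'somewhere else'."""
    return sum(y) / len(y)
```

Under these assumptions, calling code that only relies on `fit` and `predict` cannot tell the two apart, which is the sense in which delegating the work elsewhere does not by itself break the contract.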
|
FWIW, I deal with "the realities of ML devops on the cloud" nearly every day -- both at work and in hobby projects.
The comments here make me think I failed to get my main intent across in the post. I actually agree with many of the concrete claims you make here, but they have little to do with the arguments I saw myself as making.
The miscommunication was apparently so complete that if I tried to dig into specific points you make here, I'd end up effectively re-writing my post all over again. It was kind of exhausting to write the first time, so I'd prefer not to.
That said, as an example of something I didn't mean to say: I definitely don't think the sklearn API ought to be standard across ML, certainly not the pickling part! It is a well-designed API that's just right for its own limited context, and ought to inspire others to develop other APIs that are similarly well-designed for their contexts.
I only added the comment about Keras and pickle because I had quoted a tweet that literally said Keras was sklearn API compatible, and felt sort of obligated to point out that this was strictly false. Insofar as this relates to the larger point at all, it does so only as evidence that people like Chollet don't have a deep understanding of the thing they say they're inspired by.