|
I work in a ML ecosystem ATM and concurrency is a major problem in python: - Threads can't be used efficiently because of the GIL
- multiprocesses has to serialize everything in a single thread often killing performance. (Unless you use shared memory space techniques, but that's less than ideal compared to threads)
- You can't use multiprocess while inside a multiprocess executor. This makes building things on top of frameworks/libs that use multiprocess a nightmare... e.g try to use a web server like over something like Keras...
Those are the top reasons I don't like python but if you got appetite for more: - The dependency ecosystem is a pita, between python versions, package versions pinned or unpinned, requirements.txt, pipenv, poetry, conda... pick one and you're still sure to get into issues with other tools needing one system of another, or packages working a bit differently in conda etc... (I use poetry, with conda or pyenv)
- The culture of let's write code easily is good to start with but it becomes a problem as people especially maybe in DS don't go further then that... and you end up with bad practices all over the place, un-testable code (the tests systems are also a pain to navigate), copy & pasted blobs etc... Reading the code of some major libraries doesn't inspire confidence, especially compared to like Java, C++, go...
And last note I've seen way better emacs setup for python and presentations, it's ok as it is but I would not call it a Jimi Hendrix of python like a comment said...I wish ML/DS would switch to Julia |
For example: "When training a [type] model with X characteristics, the GIL causes Y, which makes it impossible to do Z".
We're building our machine learning platform[0] to solve problems we have faced shipping ML products to enterprise, and are interested in your problems as well.
For example, we've faced the environment/dependencies/"runs on my machine" problems and have addressed these with Docker images. Our users can spin up a notebook server with near real-time collaboration to work with others, and no setup because the environment is there.
The same with training jobs: they can click on a button and schedule a long-running notebook that runs against a specific environment to avoid "just yesterday I had X accuracy on my machine". The runs are tracked, the models, parameters, and metrics are automatically tracked because if we rely on a notebook author to do it, they might forget or have to context switch and it's an added cognitive load.
Some problems we faced were during deployment, too, where a "data scientist" writes a notebook to train a model and then we had to deploy that model reading their notebook or looking into dependencies. Now they can click on a button and deploy whichever model they want. It really was hindering us because they were asking someone else's help, who may have been working on something else.
- [0]: https://iko.ai