Agreed. I think this effort is completely missing the real pain points that ML suffers from. While Python the language is easy, and in many ways great for its original purpose as a teaching language, I'll take note of the few ways that Python ML suffers:

- pip hell. Really, having globally installed dependencies was great for the 90s and is terrible now that disk space is more or less a non-issue relative to dependencies. Venv/conda, which do sneaky things, e.g. with your shell, are super dangerous (https://twitter.com/garybernhardt/status/1653171980483575808), and a misstep can trash your system, especially when it has to deal with wheels with system-level dependencies (looking at you, tensorflow -- probably half of the reason why people moved to pytorch). Poetry sounds nice. It's been a while since I've checked in with the Python ecosystem. Are ML people using that yet?
- Subpar deployment. Let's remember that containerization basically exists because Python does not have an ops story.
- Subpar integration with web. You are forced to either create a microservice or spin it up within Django (nobody really does this). Then you typically have to pull in a bunch of sidecar processes (Redis, Celery, etc.) just to get queuing of your web jobs correct.
- Poor concurrency. Sure, you can run your tensorflow code in an awkward 'with' statement, but I think there are very few ML practitioners who could really explain to you what that 'with' is doing. The GPU is actually fundamentally an asynchronous entity. And god help you if you want to run and debug async Python.
- No distribution story. Sure, the big guys are able to spin up, e.g., Horovod, but it's not really an option for someone with fewer resources who wants to run something for a hot second on a few machines, and again, god help you if something goes wrong and you need to debug it.

Does Mojo solve any of these issues? From a cursory look, it looks like no.
So rather than writing another Python wrapper over C++, they are making a new performant language that can call Python.
To me it makes sense, as torch is great and hard to compete with, but everything feeding into it is a mess today (data loading, distribution logic).