by daniel_grady 861 days ago
What are some of the reasons that teams use conda (and related tools) today? As a machine learning scientist, I used conda exclusively in the mid-2010s because it was the only framework that could reliably manage Python libraries like NumPy, PyTorch, and so on, that have complex binary dependencies. Today, though, pip install works fine for those packages. What am I missing?
4 comments

For me personally, I prefer conda because it gives me dependency resolution (mamba), virtual environments, and a package repository (conda-forge), all from one base miniconda installation. And for all of my use cases, all of those just work. Dependency solving used to be painfully slow; mamba solved that. Packages used to be way behind the latest; setting conda-forge as my default solved that.

After fiddling with different solutions for years and having to start fresh with a new Python install, I've been using nothing but miniconda for years and it just works.

Unfortunately, far too often: tradition.

Using only "Python's native tools" like pip and venv simply works so well nowadays that I wonder about the purpose of many tools like poetry etc.

Has anyone else been paying attention to how hilariously hard it is to package PyTorch in poetry?

https://github.com/python-poetry/poetry/issues/6409

For me it's the easiest and fastest cross-platform way to consistently install a Python version.

pip and venv work fine, but you have to get them first, and that can be a struggle for unseasoned Python devs, especially if you need a version that's not what your distro ships, and even more so on Windows and macOS.

I use micromamba [1] specifically, which is a single binary.

[1] https://mamba.readthedocs.io/en/latest/user_guide/micromamba...

Maybe it's because I came into Python later, but I've almost never had the problem of pip not being installed. That's what ensurepip is for, right?
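
For anyone who wants to check this on their own machine, here is a minimal sketch of the ensurepip route. The helper names `pip_available` and `bootstrap_pip` are made up for illustration; `ensurepip` itself is part of the standard library, though some Linux distributions package it separately, so the bootstrap step can still fail there.

```python
import importlib
import importlib.util
import subprocess
import sys


def pip_available() -> bool:
    """Return True if the running interpreter can import pip."""
    return importlib.util.find_spec("pip") is not None


def bootstrap_pip() -> None:
    """Install the bundled pip via ensurepip, but only if pip is missing."""
    if pip_available():
        return
    # Same as running `python -m ensurepip --upgrade` by hand.
    subprocess.run([sys.executable, "-m", "ensurepip", "--upgrade"], check=True)
    # Make the freshly installed package visible to the import system.
    importlib.invalidate_caches()


if __name__ == "__main__":
    bootstrap_pip()
    print("pip available:", pip_available())
```

This only covers the case where a suitable Python is already installed; it does not help with getting a Python version your distro doesn't ship.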
Another reason I used to use conda was for easy native Windows installation. GPU-accelerated packages like OpenCV were especially difficult when I used it 6 years ago. Now there’s the Linux subsystem. Has pip support dramatically improved on Windows?
The biggest advantage of poetry I found, working with a lot of non-traditional software people, is that it does a lot of things by default, like pinning versions and managing virtual envs. Unfortunately, it does complicate some things.
I can understand that well. A few articles from Bite code! helped me to "follow my intuition" and do as much as possible with native Python tools.

https://www.bitecode.dev/p/back-to-basics-with-pip-and-venv

https://www.bitecode.dev/p/relieving-your-python-packaging-p...

Those are interesting pointers; appreciate it! My own experience over the past three years has been similar. I tried using Pipenv, and then Poetry, for internal projects at my company; in both cases the tool seemed overly complicated for the problem and slow, and I had a hard time getting co-workers on board. About a year and a half ago, I saw [Boring Python: dependency management](https://www.b-list.org/weblog/2022/may/13/boring-python-depe...), which recommends using the third-party `pip-tools` library alongside `pip` and the standard library’s `venv`, and switched to that for the next project. It’s been working great. The project has involved a small team of scientists (four or five, depending) who use a mix of macOS and Windows. We do analysis and development locally, write production-facing algorithms in Python packages tracked in our repository, and publish releases to GitLab’s PyPI registry. For our team, the “get up and running” instructions are “clone, create a venv, and pip install -r requirements.txt”, and for the software team that manages the production systems, deploying an update just means pip installing a new version of the package. Every team’s got different constraints, of course, but this has been working very smoothly for us for over a year now, and it’s been easy, with no pushback and everyone understanding what’s going on. Really impressed with the progress of the core Python packaging infrastructure over the past several years.
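
In case it helps, the "create a venv, and pip install -r requirements.txt" part can be sketched with the standard library alone. This is a rough illustration, not our actual tooling: the `.venv` directory name and the helpers `venv_python` and `setup_project` are assumptions, and the pinned `requirements.txt` is assumed to exist already (for example, generated by pip-tools' `pip-compile`).

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(env_dir: Path) -> Path:
    """Path to the interpreter inside a venv (the layout differs on Windows)."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def setup_project(project_dir: Path, requirements: str = "requirements.txt") -> Path:
    """Create <project_dir>/.venv and install the pinned requirements into it."""
    env_dir = project_dir / ".venv"  # assumed directory name
    venv.create(env_dir, with_pip=True)
    python = venv_python(env_dir)
    req_file = project_dir / requirements
    if req_file.exists():
        # Use the venv's own interpreter so packages land inside the venv.
        subprocess.run(
            [str(python), "-m", "pip", "install", "-r", str(req_file)],
            check=True,
        )
    return python


if __name__ == "__main__":
    print("venv interpreter:", setup_project(Path.cwd()))
```

Calling pip as `python -m pip` through the venv's interpreter avoids having to activate the environment first, which keeps the same steps working on both macOS and Windows.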
> Today, though, pip install works fine for those packages.

pip install works, but pip's dependency management doesn't seem to (for PyTorch, specifically). That is why projects that offer pip + requirements.txt as one of their installation methods will often have separate PyTorch installation instructions for that method, whereas if the same project supports conda installation, it will be a one-stop-shop installation that way.

> pip's dependency management doesn't seem to (for Pytorch, specifically)

That’s interesting — I’ve also had difficulties with PyTorch and dependency resolution, but only on the most recent versions of Python, for some period of time after they’re released. Picking Python 3.9 as a baseline for a project, for example, has been very reliable for PyTorch and all the related tooling.

One reason to choose one over the other is the dependencies they’re bundled with. Take NumPy. From PyPI, it’s bundled with OpenBLAS; from conda’s defaults channel, it’s built against Intel MKL, which can be faster. See https://numpy.org/install/#
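
If you want to see which one your own install uses, here is a small sketch. `numpy.show_config()` is a real NumPy function that prints the build configuration; the helpers `blas_report` and `guess_blas` are hypothetical names, and the substring matching is only a rough heuristic, since the report format varies between NumPy versions.

```python
import contextlib
import io


def blas_report() -> str:
    """Return NumPy's build configuration as text, or a note if NumPy is missing."""
    try:
        import numpy as np
    except ImportError:
        return "numpy is not installed"
    buffer = io.StringIO()
    # show_config() prints to stdout, so capture it instead of parsing internals.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


def guess_blas(report: str) -> str:
    """Heuristic: look for well-known BLAS library names in the report text."""
    lowered = report.lower()
    for name in ("mkl", "openblas", "accelerate", "blis"):
        if name in lowered:
            return name
    return "unknown"


if __name__ == "__main__":
    print("BLAS backend (best guess):", guess_blas(blas_report()))
```

Running it once in a pip-based venv and once in a conda environment is a quick way to compare the two on your own hardware.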
That’s a great point; I didn’t know about that!