by IanOzsvald 1266 days ago
Can other data scientists comment?

I'm 15 years in with Python and scientific work. For many years I liked conda, but then it got crazy slow. Next I started making conda environments and installing packages with pip. Now I'm experimenting with mamba ("fast conda") and that's pretty good.

Conda envs mean I can experiment with different versions of Python (I'm a co-author of O'Reilly's High Performance Python, so e.g. 3.11 and 3.12 are pretty interesting right now). Conda "should" also make identical teaching environments (I teach my own courses). Pip was a pragmatic choice to get installations done in minutes, not hours, in the years when conda was silly-slow.
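
For anyone wanting to try the one-environment-per-Python-version idea, here is a minimal sketch that renders an `environment.yml` for each version. The environment names, the package list, and the `conda-forge` channel are placeholders I picked for illustration, not anything from my actual courses.

```python
def render_environment_yml(name, python_version, packages, channel="conda-forge"):
    """Build the text of a conda environment.yml with a pinned Python version."""
    lines = [
        f"name: {name}",
        "channels:",
        f"  - {channel}",
        "dependencies:",
        f"  - python={python_version}",
    ]
    # One dependency entry per package, in the order given.
    lines += [f"  - {pkg}" for pkg in packages]
    return "\n".join(lines) + "\n"


# Hypothetical teaching environments, one per Python version being compared.
for version in ("3.11", "3.12"):
    env_name = f"course-py{version.replace('.', '')}"
    print(render_environment_yml(env_name, version, ["numpy", "pandas"]))
```

Saving the output to a file and running `conda env create -f environment.yml` (or the `mamba` equivalent) should build the environment. To get closer to truly identical setups you would probably also pin each package version.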

The above is also all for short-lived research work (my typical client mode for scientific work), so it is probably different for anyone doing long-run dev work or production deploys, or for those who don't need non-Python binary support (e.g. GPU/C/Fortran lib support).

4 comments

> Now I'm experimenting with mamba ("fast conda") and that's pretty good.

+1 for this. I really like Conda because I use a lot of packages that cannot be installed with pip alone (e.g. GDAL). Conda would sometimes take 10 minutes to make an environment.

Mamba has been an absolute game changer. It has rewritten parts of Conda in C++ and added multithreading, so every part of the process (the solving, the download, and the extraction) is lightning quick in comparison. A 10-minute process with Conda can now take 2.5 minutes with Mamba. It also helps that it is prettier to look at!

I found that the best way to install GDAL was:

- on Windows, using Christoph Gohlke’s wheels,

- on Linux, installing GDAL with the OS’ package manager, then the Python bindings by fixing the version to $(gdal-config --version) and setting C_INCLUDE_PATH and CPLUS_INCLUDE_PATH.

It’s a bit more involved, but pip’s speed, set against conda’s multiple channels and dependency resolution issues, makes this worth it.
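
The Linux route above could be scripted roughly like this. It is only a sketch: it assumes the include variables meant are GCC's `C_INCLUDE_PATH` and `CPLUS_INCLUDE_PATH`, that the headers live in `/usr/include/gdal` (this varies by distro), and that `gdal-config` is on the PATH. The function names are made up for the example.

```python
import os
import subprocess
import sys


def build_gdal_install(gdal_version, include_dir="/usr/include/gdal"):
    """Return the pip command and environment for bindings matching the system GDAL."""
    env = dict(os.environ)
    # Header search paths read by GCC-compatible compilers during the build.
    # The default include_dir is an assumption; adjust it for your distro.
    env["C_INCLUDE_PATH"] = include_dir
    env["CPLUS_INCLUDE_PATH"] = include_dir
    command = [sys.executable, "-m", "pip", "install", f"GDAL=={gdal_version}"]
    return command, env


def install_matching_gdal():
    """Query the system GDAL version, then install the same version of the bindings."""
    result = subprocess.run(
        ["gdal-config", "--version"], check=True, capture_output=True, text=True
    )
    command, env = build_gdal_install(result.stdout.strip())
    subprocess.run(command, env=env, check=True)
```

Calling `install_matching_gdal()` inside the target virtual environment does the whole thing in one go. `build_gdal_install` is split out so you can inspect the command and environment before anything is run.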

Agreed. I recently saw a blog post from the Mamba team saying that conda is also starting to incorporate libmamba, so the issue will probably be resolved someday.

I am a Lisp programmer who has mostly used Python for deep learning. I found the long runtimes for some conda operations worthwhile in order to have several very different environments set up. I now try to use Google Colab for as much of my deep learning work as possible, so I have thought of dropping conda and trying an alternative setup, because I am writing a Python book that covers deep learning but also covers a lot of use cases that I usually use Common Lisp for. I will probably take the time to try most of the options in this blog article.

Mamba is great! You won't find many self-promotional blog posts about it because it just works.

Mamba doesn't work with optional dependencies (those in square brackets, e.g. pytorch[gpu])
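
One possible workaround, sketched under the assumption that pip is run inside the mamba environment for the affected packages (as the top comment describes doing with conda): split the requirement list into specs mamba can take and specs that carry extras. The function name and the sample list are invented for illustration, and `pytorch[gpu]` is just the example from above, not a claim that such an extra exists.

```python
import re

# A requirement with extras looks like "name[extra1,extra2]", optionally
# followed by a version constraint.
EXTRAS_PATTERN = re.compile(r"^\s*[A-Za-z0-9._-]+\s*\[[^\]]+\]")


def split_by_extras(specs):
    """Separate plain specs from specs that request optional dependencies."""
    plain, with_extras = [], []
    for spec in specs:
        if EXTRAS_PATTERN.match(spec):
            with_extras.append(spec)
        else:
            plain.append(spec)
    return plain, with_extras


plain, with_extras = split_by_extras(["numpy", "pytorch[gpu]", "pandas>=2.0"])
print("install with mamba:", plain)
print("install with pip:", with_extras)
```

The regex is deliberately simple and does not cover the full requirement grammar (markers, URLs and so on), so treat it as a starting point only.
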
> Conda "should" also make identical teaching environments (I teach my own courses).

It's probably easier for your students if you provide containers for them, then everyone has an identical environment.