by pfranz 1678 days ago
So you're suggesting always using virtualenv?

I used to just use pip to install to the system. Months or years later I would try to untangle the mess: packages I was just playing with, packages the OS wanted or needed, the conflicting dependencies you mention, and so on. I usually ended up reinstalling the OS. At the time I may not have been as knowledgeable about where the OS package manager keeps packages vs pip--but the whole thing wasn't very user-friendly either.

For years I've been installing into user knowing I can just blow it away. I've dabbled with virtualenv, but it's such a pain to set up and activate. If I have a few projects with similar libraries it's more of a pain to set them all up and switch around. If I end up using a script for something important, I just spend the extra time at that point to "package" it.

2 comments

This is one of the reasons people use Anaconda/miniconda for non-data science work: conda environments are self-contained Python installs, so if you conda/pip install packages into those environments, they will not break each other. This design requirement arose from the specific needs of numerical computing (which always drags in a ton of system-level C/C++/FORTRAN dependencies), but is a generically useful design construct.

Anaconda is a distro, and conda is a package manager that works across OS platforms and hardware architectures and installs cleanly into userland without requiring admin privileges. The only way we achieve this difficult goal is by creating a distro and build system that creates "portable" packages that can be relocated/relinked at install-time.

Ultimately, Python's challenges in this department come from the fact that it has such great integration with low-level C/C++ libraries. This gives it super powers as a duct tape/glue language, but it also drags it down into the packaging tech debt of C/C++. Hmm... maybe I should write that blog post: "Python Packaging Isn't The Problem; C/C++ Is." :-)

I was slow to get to grips with venv. It sounds like you are on the same path. This note is meant as constructive advice:

* Some distro software uses python. Let the package manager take care of dependencies for that.

* For everything else, use a dedicated virtualenv for each codebase you are working with.

   > I used to just use pip to install to the system
Never do this, for the reasons you cite.

   > I've dabbled with virtualenv, but it's such a pain to set up and activate
Setup for virtualenv: "python3 -B -m venv venv". Have a shell alias 'alias v=". venv/bin/activate"' that allows you to activate it if you need to install libraries or access a shell. "pip install blah" for library install. That should be all you need.
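
For anyone who would rather script that setup than type it, here is a minimal sketch using the standard library's venv module in place of the command line. It is roughly what "python3 -m venv venv" does; the function names are made up for illustration, and the -B flag from the command above is not reproduced.

```python
import sys
import venv
from pathlib import Path


def create_project_venv(project_dir, with_pip=True):
    """Create <project_dir>/venv, roughly what `python3 -m venv venv` does.

    Hypothetical helper name; only venv.EnvBuilder is real stdlib API.
    """
    venv_dir = Path(project_dir) / "venv"
    venv.EnvBuilder(with_pip=with_pip).create(str(venv_dir))
    return venv_dir


def venv_python(venv_dir):
    """Return the interpreter inside the venv (the layout differs on Windows)."""
    if sys.platform == "win32":
        return Path(venv_dir) / "Scripts" / "python.exe"
    return Path(venv_dir) / "bin" / "python"


# Example use (not run here):
#   venv_dir = create_project_venv(".")
#   subprocess.run([str(venv_python(venv_dir)), "-m", "pip", "install", "blah"], check=True)
```

Calling the venv's own interpreter with "-m pip" installs into that venv without activating anything in the current shell.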

   > If I have a few projects with similar libraries
   > it's more of a pain to set them all up and switch around
Have a think about why you feel this way, and whether you could mitigate the problems.

Here is what I do. Once my libraries are installed for the current project, I rarely activate venv in the current shell. Rather, for each python project, I have a bash script "app" in the root of the project, and a dedicated "venv" directory.

The app script does the following: (1) sources the local venv; (2) does pip freeze > requirements.txt to capture any dependency changes; (3) launches the project. Often I will have multiple launchers in that script, with all of them commented out except for the active one. Get into the habit of always launching from that app script.
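
The three steps above can be sketched as follows. Note the swap: the script described is bash and sources the venv, whereas this is a Python rendering that calls the venv's interpreter directly, which has the same effect for the launched process but does not put the venv on PATH. The entry point "main.py" and all function names are assumptions for illustration, not the actual script.

```python
import subprocess
import sys
from pathlib import Path


def project_python(project_dir: Path) -> Path:
    """Interpreter inside <project_dir>/venv (the layout differs on Windows)."""
    if sys.platform == "win32":
        return project_dir / "venv" / "Scripts" / "python.exe"
    return project_dir / "venv" / "bin" / "python"


def freeze_command(project_dir: Path) -> list:
    """Command for step (2): list the packages installed in the venv."""
    return [str(project_python(project_dir)), "-m", "pip", "freeze"]


def launch_command(project_dir: Path, entry: str, *args: str) -> list:
    """Command for step (3): run the entry script with the venv's interpreter."""
    return [str(project_python(project_dir)), str(project_dir / entry), *args]


def run_app(project_dir: Path, entry: str = "main.py", *args: str) -> int:
    # (1) No activation step: using the venv's interpreter is enough
    #     for the launched process to see the venv's packages.
    # (2) Capture dependency changes in requirements.txt.
    frozen = subprocess.run(
        freeze_command(project_dir), check=True, capture_output=True, text=True
    ).stdout
    (project_dir / "requirements.txt").write_text(frozen)
    # (3) Launch the project ("main.py" is a hypothetical entry point).
    return subprocess.run(launch_command(project_dir, entry, *args)).returncode
```

Swapping launchers then means changing the entry argument rather than commenting lines in and out.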

To reiterate the approach above, whenever you sit down to write some python code, ensure that you have a dedicated venv for it, and that you are only ever launching code from that local venv.

I have spoken to developers who get upset at the extra hard disk overhead. You don't need to optimise for hard disk usage. Hard disk space is almost free.

I don't bother creating setup.py files, except for the odd occasion that I want to publish code to PyPI. Good luck.

Thanks for taking the time to share!

That sounds like the general approach I take for "projects", even toy projects. My day jobs have never fit the virtualenv use-case, so at home I often have to look up how to use it. It's so rare that even when I make an alias, I forget it.

Most new things are one-off scripts: move or rename some files, extract data from something, or pull from a resource. Something that requires libraries or is too big for a shell script. For example, the last one I see in my bin is a web scraper for appointments. It pulls a website, fills out a form, and gets the result a few times--about 70 lines. What's annoying is sourcing some environment just to run this one tool.

Most people who spend a lot of time at a command line have a directory of scripts (a mix of shell, Perl, or Python). It's quite a pain to source the environment just to run a quick script. Those are generally the libraries I install into user. I don't care much about the version and troubleshoot things as they come up.