Hacker News
by caconym_ 229 days ago
There is nothing I dread more within the general context of software development, broadly, than trying to run other people's Python projects. Nothing. It's shocking that it has been so bad for so long.
17 comments

Never underestimate cultural momentum I guess. NBA players shot long 2 pointers for decades before people realized 3 > 2. Doctors refused to wash their hands before doing procedures. There are so many things that seem obvious in retrospect but took a long time to become accepted.
Hey and you can use both lanes in a zip merge!
Isn't that the law anyway?

Moral: follow the rules.

>NBA players shot long 2 pointers for decades before people realized 3 > 2

And the game is worse for it :')

This is a fundamental problem in sports. Baseball is going the same way. Players are incentivized to win, and the league is incentivized to entertain. Turns out these incentives are not aligned.
> Players are incentivized to win, and the league is incentivized to entertain.

Players are incentivized to win due to specific decisions made by the league.

In Bananaball the league says, "practice your choreographed dance number before batting practice." And those same athletes are like, "Wait, which choreographed dance number? The seventh inning stretch, the grand finale, or the one we do in the infield when the guy on stilts is pitching?"

Edit: the grand finale dance number I saw is both teams dancing together. That should be noted.

Sure. There's a market for that. But the NBA sells a lot more tickets than the Harlem Globetrotters.
But that's a matter of scale. When I was a child, the Harlem Globetrotters were far more famous than any 3-4 NBA teams combined. They were in multiple Scooby Doo movies/episodes. They failed to scale the model, but wrestling didn't.
Would be very curious about, say, the worst MLB team's ticket sales vs. the Savannah Bananas.
This isn't right - the league can change the rules. NFL has done a wonderful job over the years on this.

Baseball has done a terrible job, but at least seems to have turned the corner with the pitch clock. Maybe they'll move the mound back a couple feet, make the ball 5.5oz, reduce the field by a player and then we'll get more entertainment and the players can still try their hardest to win.

I wonder if anyone has made an engine for simulating MLB play with various rule changes.

Personally, I think it'd be interesting to see how the game plays if you could only have two outfielders (but you could shift however you choose.)

It's a good thought.

I'd guess MLB The Show video game wouldn't be a bad place to start. They should have a decent simulator built in.

And the ongoing gambling scandal gives credence to a third incentive I've long suspected. Only half joking
Something Derek Thompson has written about https://archive.ph/uSgNd
Is it? I, for one, enjoy watching the 3s raining down!
They did wash their hands. Turns out that soap and water wasn't quite enough. Lister used carbolic acid (for dressing and wound cleaning) and Semmelweis used chlorinated lime (for hand washing).
And Semmelweis is a perfect case against being an asshole who's right: He was more right than wrong (he didn't fully understand why what he was doing helped, but it did) but he had such a horrible personality and such an amazing gift for pissing people off that it probably cost lives by delaying the uptake of his ideas.

But this is getting a bit off topic, I suppose.

Or you could say it the other way around: Even leading scientists are susceptible to letting emotions get the best of them and doubling down on defending their personal investments in things.

"A scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die and a new generation grows up that is familiar with it." - Max Planck.

Was soap often used prior to the mid 1800s?
That was later; earlier in history doctors (or "doctors" if you so insist) did not wash their hands.
I was mainly pushing back on the idea that something as seemingly obvious as hand washing was the thing that made surgery safe. It took quite a bit more than just simple hand washing.
People paid 100x more for their hosting when using AWS cloud until they realized they never needed 99.97% uptime for their t-shirt business. Oh wait, too soon. Save this post for the future.
People paid only 100x more than self hosting to use AWS until they realized that they could get a better deal by paying 200x for a service that is a wrapper over AWS but they never have to think about since it turns out that for most businesses that 100x is like 30 bucks a month.
People spent half their job figuring out self-hosted infrastructure until they realized they'd rather just have some other company deploy their website when they make a commit.
kubernetes
So many times I have come across a library or tool that would fix my problem, and then realized “oh crap, it’s in Python, I don’t want to spend a few hours building a brittle environment for it only for that env to break next time I need to use it” - and went to look for a worse solution in a better language.
I really don't get this. I can count on no hands the number of times I've had problems simply going "pip install cool-thing-i-found".

Sure, this is just my experience, but I use Python a lot and use a lot of tools written in Python.

If you can install it with `pip install program-name` it's usually packaged well enough to just work. But if it's a random github repository with a requirements.txt with no or very few version numbers, chances are that just running `pip install -r requirements.txt` will lead you down an hour+ rabbit hole of downgrading both your venv's python version and various packages until you get a combination that is close enough to the author's venv to actually work.

Usually happens to me when I find code for some research paper. Even something that's just three months old can be a real pain to get running

I don't disagree with you, but in my experience even having a requirements.txt file is a luxury when it comes to scientific Python code: A lot of the time I end up having to figure out dependencies based purely on whatever the script is importing
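
For what it's worth, that import-sniffing can be roughly automated with the standard library's `ast` module. This is only a sketch: `top_level_imports` is a hypothetical helper name, and what it returns are import names, which do not always match the name you would install (e.g. `cv2` comes from `opencv-python`). It also says nothing about which versions the author had.

```python
import ast
import sys


def top_level_imports(source: str) -> set[str]:
    """Return top-level module names imported by `source`, minus the stdlib."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import (from . import x); skip those
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
    # sys.stdlib_module_names exists on Python 3.10+; older versions skip the filter
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return {name for name in found if name not in stdlib}


example = """
import os
import numpy as np
from scipy.stats import norm
from . import helpers
"""
print(sorted(top_level_imports(example)))
```

To run it on a real script, pass in the file contents, for instance `top_level_imports(open("script.py").read())`, where `script.py` is a placeholder path.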
If they can't be bothered to make a requirements.txt file, I'm not seeing how uv will be of much help...
uv basically makes that a default. You don’t need to be bothered. Just uv add your dependencies and they are in your pyproject.toml.
Ah, I get it now! The problem occurs when someone publishes something without version pinning, because package versions can become incompatible over time. I don't think I've ever installed something outside of what's available on PyPI, which is probably why I've never run into this issue.

Still, I would think it's rare that package versions of different packages become incompatible?

Seconded. Python, even with virtualenv stuff, has never been bad. There have been a few things that have been annoying especially when you need system libraries (e.g. libav for PyAV to work, etc.), but you have the same issue with every other ecosystem unless the packages come with all batteries included.

To be fair to the GP comment, this is how I feel about Ruby software. I am not nearly as practiced at installing and upgrading in that ecosystem so if there was a way to install tools in a way that lets me easily and completely blow them away, I would be happier to use them.

I still have nightmares about nokogiri gem installs from back in the day :/
Shudder. I'm guessing it was the always breaking libxml2 compilation step right?
This mentality is exactly what many people do wrong in Python. I mean, for a one-off, yes you can have setup instructions like that. But if you want things to work for other people, on other machines, you better include a lock file with checksums. And `pip install whatever` simply does not cut it there.
Except I'm saying my experience is the opposite of the problem you purport. I (as the consumer) have always done "pip install whatever", and have never run into issues.

One of the commenters above explained what the problem really is (basically devs doing "pip install whatever" for their dependencies, instead of managing them properly). That's more a problem of bad development practices though, no?

Nah. That's a dumb UI. If you do cargo add whatever, it does something completely different from cargo install whatever, and there's no way to inadvertently use the wrong one. If pip install whatever leaves your project in a state that may be unusable for other people, but usable for you, that's just design that invites confusion and works-on-my-machine syndrome.
Recently (like for several years), with most packages providing wheels for most platforms, it tends to be less of a problem of things actually working, except for dependencies where the platform specifiers used by Python are insufficient to select the right build of the dependency, like PyTorch.
I know, this is just how it is, I guess. Those of us mystified what the big problem is with virtualenv and pip, and why we all have to use a tool distributed by a for-profit company that's not even written in Python, will just have to start a little club or something.

I guess this is mostly about data science code and maybe people who publish software in those communities are just doing very poor packaging, so this idea of a "lock file" that freezes absolutely everything with zero chance for any kind of variation is useful. Certainly the worst packaged code I've ever seen with very brittle links to certain python versions and all that is typically some ML sort of thing, so yeah.

This is all anathema to those of us who know how to package and publish software.

Recently I've been playing with Chatterbox and the setup is a nightmare. It specifically wants Python 3.11. You have 3.12? TS. Try to do pip install and you'll get an error about pkg-config calling a function that no longer exists, or something like that.

God, I hate Python. Why is it so hard to not break code?

I experienced that recently - just curious, since you're digging into voice synth, what are open-source voice synth (specifically text-to-speech) which have been working for you? Recently, I have tried PiperTTS (I found the voices very flat, and accented), Coqui (in the past - it wasn't great, and doesn't seem to be supported). I spent a ton of time trying to get Chatterbox to work (on Debian Linux 13) - and ultimately couldn't get the right mix of Python versions, libraries etc. At this moment, I'm using AWS Polly and ElevenLabs (and occasionally macOS `say`), but would love to have an open-source TTS which feels quality, and I can psychologically invest in. Thanks for any perspective you can share.
>I spent a ton of time trying to get Chatterbox to work (on Debian Linux 13)

Exactly my case. I had to move back to Debian from Ubuntu, where I had installed Chatterbox without much difficulty, and it was hell. You pretty much need Anaconda. With it, it's a cinch.

>what are open-source voice synth which have been working for you.

I tried a few, although rather superficially. Keeping in mind that my 3090 is on my main (Windows) machine, I was constrained to what I could get running on it without too much hassle. Considering that:

* I tried Parler for a bit, although I became disillusioned when I learned all models have an output length limit, rather than doing something internally to split the input into chunks. What little I tried with it sounded pretty good if it stayed within the 30-second window, otherwise it became increasingly (and interestingly) garbled.

* Higgs was good. I gave it one of Senator Armstrong's lines and made it generate the "mother of all omelettes" one, and it was believable-ish; not as emphatic but pretty good. But it was rather too big and slow and required too much faffing around with the generation settings.

* Chatterbox is what I finally settled with for my application, which is making audiobooks for myself to listen to during my walks and bike rides. It fits in the 3070 I have on the Linux machine and it runs pretty quick, at ~2.7 seconds of audio per second.

These are my notes after many hours of listening to Chatterbox:

* The breathing and pauses sound quite natural, and generally speaking, even with all the flaws I'm about to list, it's pleasing to listen to, provided you have a good sample speaker.

* If you go over the 40-second limit, it handles it somewhat more gracefully than Parler (IMO). Instead of generating garbage it just cuts off abruptly. In my experience splitting text at 300-350 characters works fairly well, and keeping paragraphs intact where possible generates best results.

* If the input isn't perfectly punctuated it will guess at the sentence structure to read it with the correct cadence and intonation, but some things can still trip it up. I have one particular text where the writer used commas in many places where a period should have gone, and it just cannot figure out the sentence structure like that.

* The model usually tries to guess emotion from the text content, but it mostly gets it wrong.

* It correctly reads quoted dialogue in the middle of narration, by speaking slightly louder. If the text indicates a woman is speaking the model tries to affect a high pitch, with varying degrees of appropriateness in the given context. Honestly, it'd be better if it kept a consistent pitch. And, perplexingly, no matter how much the surrounding text talks about music, it will read "bass" as "bass" (the fish), instead of "base".

* Quite often the model inserts weird noises at the beginning and end of a clip which will throw you off until you learn to ignore them. It's worse for short fragments, like chapter titles and the like. Very rarely it inserts what are basically cut-off screams, like imagine a professional voice actor is doing a recording and just before he hit stop someone was murdered inside the booth.

* It basically cannot handle numbers more than two digits long. Even simple stuff like "3:00 AM" it will read as complete nonsense like "threenhundred am".

* It also has problems with words in all caps. It's a tossup if it's going to spell it out, yell it, or something in between. In my particular case, I tried all sorts of things to get it to say "A-unit" (as in a unit with the 'A' designation) properly, but sometimes it still manages to fuck it up and go "ah, ah, ah, ah, ah, ah unit".

* Sometimes it will try to guess the accent it should use based on the grammar. For example, I used a sample from a Lovecraft audiobook, with a British speaker, and the output will sometimes turn Scottish out of nowhere, quite jarringly, if the input uses "ya" for "you" and such.
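
For anyone wanting to try the splitting approach from the notes above (chunks of roughly 300-350 characters, paragraphs kept intact where possible), here is a minimal sketch. It is not part of Chatterbox or any TTS library; `chunk_text` and `max_chars` are hypothetical names, and the sentence splitting is a naive regex that will trip on abbreviations.

```python
import re


def chunk_text(text: str, max_chars: int = 350) -> list[str]:
    """Split text into chunks aiming for at most max_chars each.

    Paragraphs that fit are kept whole. Longer paragraphs are split at
    sentence ends. A single sentence longer than max_chars is kept whole.
    """
    chunks = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        paragraph = " ".join(paragraph.split())  # collapse internal whitespace
        if not paragraph:
            continue
        if len(paragraph) <= max_chars:
            chunks.append(paragraph)  # short paragraphs stay intact
            continue
        # Long paragraph: pack whole sentences greedily up to the limit.
        current = ""
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            candidate = f"{current} {sentence}".strip()
            if current and len(candidate) > max_chars:
                chunks.append(current)
                current = sentence
            else:
                current = candidate
        if current:
            chunks.append(current)
    return chunks
```

Each returned chunk would then be fed to the synthesizer separately and the audio clips concatenated afterwards. The 350 default is just the upper end of the range mentioned above, so tune it to taste.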

Thank you - this is helpful. I didn't realize how much I was going to value consistency over voice quality, but then when you've got to go back and listen to everything for quality control ... I guess that is the drawback of this phase of "generative" voice synth.
> pip install cool-thing-i-found

This is the entire problem. You gonna put that in a lock file or just tell your colleagues to run the same command?

I meant I'm running that command as the consumer, and have never had problems. When I make my own packages, I ensure that anyone doing the same thing for my package won't have issues by using version pinning.
Having packages in a package manager is the problem?
like democracy, it's the worst programming language except vs everything else...
This comment is pithy, but I reject the sentiment.

In 2025, the overall developer experience is much better in (1) Rust compared to C++, and (2) Java/DotNet(C#) compared to Python.

I'm talking about type systems/memory safety, IDEs (incl. debuggers & compilers), package management, etc.

Recently, I came back to Python from Java (for a job). Once you take the drug of a virtual machine (Java/DotNet), it is hard to go back to native binaries.

Last, for anyone unfamiliar with this quote, the original is from Winston Churchill:

    Many forms of Government have been tried, and will be tried in this world of sin and woe. No one pretends that democracy is perfect or all-wise. Indeed it has been said that democracy is the worst form of Government except for all those other forms that have been tried from time to time.
How come it's easier if the tool is in another language? What are the technical (or cultural) reasons? Do most C programs use static linking, or just not have deps?
When I need to build an established project written [mostly] in C or C++, even if I don't have the dependencies installed, it's typically just a matter of installing my distro's packages for the deps and then running configure and make, or whatever. It usually works for me. Python almost never does until I've torn half my hair out wrapping my brain around whatever new band-aid bullshit they've come up with since last time, still not having understood it fully, and muddled through to a working build via ugly shortcuts I'm sure are suboptimal at best.

I don't really know why this is, at a high level, and I don't care. All I know is that Python is, for me, with the kinds of things I tend to need to build, the absolute fucking worst. I hope uv gets adopted and drives real change.

My last dance with Python was trying to build Ardupilot, which is not written in Python but does have a build that requires a tool written in Python, for whatever reason. I think I was on my Mac, and I couldn't get this tool from Homebrew. Okay, I'll install it with Pip—but now Pip is showing me this error I've never seen before about "externally managed environments", a concept I have no knowledge of. Okay, I'll try a venv—but even with the venv activated, the Ardupilot makefile can't find the tool in its path. Okay, more googling, I'll try Pipx, as recommended broadly by the internet—I don't remember what was wrong with this approach (probably because whatever pipx does is totally incomprehensible to me) but it didn't work either. Okay, what else? I can do the thing everybody is telling me not to do, passing `--break-system-packages` to plain old Pip. Okay, now the fucking version of the tool is wrong. Back it out and install the right version. Now it's working, but at what cost?

This kind of thing always happens, even if I'm on Linux, which is where I more usually build stuff. I see errors nobody has ever posted about before in the entire history of the internet, according to Google. I run into incomprehensible changes to the already incomprehensible constellation of Python tooling, made for incomprehensible reasons, and by incomprehensible I mean I just don't care about any of it, I don't have time to care, and I shouldn't have to care. Because no other language or build system forces me to care as much, and as consistently, as Python does. And then I don't care again for 6 months, a year, 2 years, until I need to do another Python thing, and whatever I remember by then isn't exactly obsolete but it's still somehow totally fucking useless.

The universe has taught me through experience that this is what Python is, uniquely. I would welcome it teaching me otherwise.
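
For context on the "externally managed environments" error mentioned above: it comes from PEP 668, under which a distro or Homebrew can drop an `EXTERNALLY-MANAGED` marker file into the interpreter's stdlib directory, and pip then refuses to install outside a virtual environment. Below is a rough sketch for checking which situation a given interpreter is in. The function names are made up for illustration, and the marker location follows my reading of the PEP, so treat it as an assumption rather than pip's exact logic.

```python
import sys
import sysconfig
from pathlib import Path


def in_virtualenv() -> bool:
    # In a venv, sys.prefix points at the venv and sys.base_prefix at the base install
    return sys.prefix != sys.base_prefix


def externally_managed() -> bool:
    # PEP 668 marker file, expected next to the standard library
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    return marker.is_file()


print("interpreter:", sys.executable)
print("virtualenv active:", in_virtualenv())
print("EXTERNALLY-MANAGED marker present:", externally_managed())
```

Run it with the same interpreter the build uses. If the makefile invokes a different `python` than the activated venv, `sys.executable` should make that visible.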

I agree with you wholeheartedly, besides not preferring dynamic programming languages, I would in the past have given python more of a look because of its low barrier to entry...but I have been repulsed by how horrific the development ux story has been and how incredibly painful it is to then distribute the code in a portable ish way.

uv is making me give Python a chance for the first time since a Ren'Py project I did for fun in 2015.

That's because many people don't pay attention to reproducibility of their developed software. If there is no lock file in a repo that nails the exact versions and checksums, then I already know it's likely gonna be a pain. That's shoddy work of course, but that doesn't stop people from not paying attention to reproducibility.

One could argue that this is one difference between npm and the like and what many people use in the Python ecosystem. npm, cargo, and so on create lock files automatically. Even people who don't understand why that is important might commit them to their repositories, while in the Python ecosystem, people who don't understand it think that committing only a requirements.txt (without checksums) is OK.

However, it is wrong to claim that in the Python ecosystem we didn't have the tools to do it right. We did have them, and well before uv. It took a bit more care though, which is apparently too much for many people already.
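
To make "exact versions and checksums" concrete: pip's hash-checking mode accepts requirement lines of the form `name==version --hash=sha256:<digest>`. The sketch below builds such a line for a locally downloaded artifact. `sha256_of` and `pinned_line` are hypothetical helpers for illustration, not part of pip or any packaging tool. In practice a lock tool generates these lines for you.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in blocks to keep memory use flat."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def pinned_line(name: str, version: str, artifact: Path) -> str:
    """Format one requirements line with an exact version and a checksum."""
    return f"{name}=={version} --hash=sha256:{sha256_of(artifact)}"
```

A file made of such lines can be installed with `pip install --require-hashes -r requirements.txt`, and pip will then reject any download whose digest does not match.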

The lock file shouldn't be in the repository. That forces the developers into maintenance that's more properly the responsibility of the CI/CD pipeline. Instead, the lock file should be published with the other build artifacts—the sdist and wheel(s) in Python's case. And it should be optional so that people who know what they're doing can risk breaking things by installing newer versions of locked dependencies should the need arise.
It absolutely should be. Otherwise you don’t have reproducible builds.
You can reproduce the release just fine using the lock file published alongside the release. Checking it in creates unnecessary work for devs, who should only be specifying version constraints when absolutely necessary.
> Checking it in creates unnecessary work for devs, who should only be specifying version constraints when absolutely necessary.

The unnecessary work of a `git commit`?

Having the file be versioned creates no requirement to update its contents any more frequently than before, and it streamlines "publishing alongside the release". The presence of the lockfile in the repo doesn't in any way compel devs to use the lockfile.

You aren’t kidding. Especially if it’s some bioinformatics software that is just hanging out there on GitHub older than a year…
Do you think bioinformatics libs written in C++ do not have the same issues?
There weren’t that many that weren’t precompiled for Linux in the C++ world. Python is bad, but others have issues too.

C/C++ often had to be compiled using “make”, which I’ll admit to being better at than conda/pip.

I suspect this is because the C/C++ code was developed by people with a more comp sci background. Configure/make/make install... I remember compiling this one.

https://mafft.cbrc.jp/alignment/software/source.html

If the software made it into BioGrids, life was easier.

https://biogrids.org/

But a lot of the languages had their own quirks and challenges (Perl cpan, Java…). Containerization kinda helps.

I mean, I think this is par for the course for anything written by a grad student. Be thankful it's not written in MATLAB.
The only thing I dreaded more was trying to run other people's C++ projects.
vcpkg seems to help a lot there, at least for Windows code and msbuild/Visual Studio.
Which means you’re already generally in worse shape than Python. At least Python’s half baked packaging systems try to be multi-platform.
vcpkg is also multi-platform (Linux, macOS). I just haven't used it for any of those yet.
I was into Python enough that I put it into my username but this is also my experience. I have had quasi-nightmares about just the bog of installing a Python project.
I used to think this sentiment was exaggerated. Then I tried installing Dots OCR. What a nightmare, especially when NVIDIA drivers are involved.
Same! And Python was my first, and is currently my second-highest-skill language. If someone's software's installation involves Python, I move on without trying. It used to be that it would require a Python 2 interpreter.

Honorable mention: Compiling someone else's C code. Come on; C compiles to a binary; don't make the user compile.

There's a lot more involved in distributing C (and C++) programs than just compiling them:

I'm assuming a Linux based system here, but consider the case where you have external dependencies. If you don't want to require that the user installs those, then you gotta bundle them or link them statically, which is its own can of worms.

Not to mention that a user with an older glibc may not be able to run your executable, even if they have your dependencies installed. Which you can, for example, solve by building against musl or a similar glibc alternative. But in the case of musl, the cost is a significant overhead if your program does a lot of allocations, due to it lacking many of the optimizations found in glibc's malloc. Mitigating that is yet another can of worms.

There's a reason why tools like Snap, AppImage, Docker, and many more exist, each of which is its own can of worms.

Yea def. I think Linux's ABI diaspora and the way it handles dependencies is a pain, and the root cause behind both those distro methods you mention and why software is distributed as source instead of binaries. I contrast this with Rust (and I know you can do this with C and C++, but it's not the norm):

  - Distribute a single binary (Or zip with a Readme, license etc) for Windows
  - Distribute a single binary (or zip etc) for each broad Linux distro; you can cover the majority with 2 or 3. Make sure to compile on an older system (Or WSL edition), as you generally get forward compatibility, but not backwards.
  - If someone's running a Linux distro other than what you built, they can `cargo build --release`, and it will *just work*.
Another nice thing is that, if you can live with the slower musl malloc, then building a "universal" Linux binary with Cargo takes just two commands:

$ rustup target add x86_64-unknown-linux-musl

$ cargo build --target x86_64-unknown-linux-musl --release

Similarly for cross-compiling for Windows

It may be fixed now, but devil's in the details. As one example, musl has (or had) chronic issues with its DNS resolver and large responses.
Definitely. I haven't tried building anything that requires DNS using musl, but I've had to work around musl's much, much slower malloc implementation

The musl wiki lists a number of differences between it and glibc that can have an impact:

https://wiki.musl-libc.org/functional-differences-from-glibc...

I should try that!
> C compiles to a binary; don't make the user compile.

C compiles to many different binaries depending on the target architecture. The software author doesn't necessarily have the resources to cross-compile for your system.

Incidentally, this is probably exactly the thing that has made most of those Python installations problematic for you. Because when everything is available as a pre-built wheel, very much less can go wrong. But commonly, Python packages depend on included C code for performance reasons. (Pre-built results are still possible that Just Work for most people. For example, very few people nowadays will be unable to install Numpy from a wheel, even though it depends on C and Fortran.)

> Honorable mention: Compiling someone else's C code. Come on; C compiles to a binary; don't make the user compile.

Unless you’re on a different architecture, then having the source code is much more useful.

Or often just the same architecture with a slightly different OS version.
The python community was in profound denial for a very long time.
I dread running my own Python projects if I haven't worked with them in a while.
Couldn't agree more. I have a project at work from 2016 that builds multiple different HMIs (C++) along with 2 embedded systems (C). They all have to play nicely with each other as they share some structures and can all be updated in the field with a single file on a USB stick. So there is a bash script that builds everything from a fresh clone, makes update files, and some other niceties. Then, there is a single python script that generates a handful of tables from a json file.

Guess which part of the build I spent fixing the other day... It wasn't the ~200000 lines of c/c++ or the 1000+ line bash script. No. It was 100 lines of python that was last touched 2 years ago. Python really doesn't work as a scripting language.

How about shipping one? Like even just shipping some tools to internal users is a pain
I really don't understand this. I find it really easy.
Just stick to what's in your linux distribution and you've got no problems.
No need, run python as a container. No need to mix what's installed on the hostOS.

https://hub.docker.com/_/python

This manages to be even worse. Since it's set up full of holes to be usable (e.g. reaching out to the filesystem), you get the worst of random binaries without isolation, plus the dead end for updates you get in practice when dealing with hundreds of containers outside of a professionally managed cluster.
Actually, you get better isolation and resource restrictions due to cgroups v2, no mixture with host packages, and the full library stack ships with the application. When the application container is updated, so are the associated packages.
Not even trying to compile/build other people's C/C++ projects on *nix?
pfff... "other people projects".. I was not even able to run my own projects until I started using Conda.