| HN Mirror



	by samatman 831 days ago
	There's no need to pretend Python has virtues which it lacks. It's not a fast language. It's fast enough for many purposes, sure, but it isn't fast, and this work is unlikely to change that. Faster, sure, and that's great.

5 comments

rmbyrro 831 days ago

Although true, it doesn't mean they can't improve its performance.

Working with threads is a pain in Python. If you want to spawn +10-20 threads in a process, it can quickly become way slower than running a single thread.

Removing the GIL and refactoring some of the core will unlock levels of concurrency that are currently not feasible with Python. And that's a great deal, in my opinion. Well worth the trouble they're going through.

bb88 831 days ago

Working with threads is a pain regardless of which language you use.

Some might say: "Use Go!" Alas: https://songlh.github.io/paper/go-study.pdf

After a couple decades of coding, I can say that threading is better if it's tightly controlled, limited to usages of tight parallelism of an algorithm.

Where it doesn't work is in a generic worker pool where you need to put mutex locks around everything -- and then prod randomly deadlocks in ways the developer boxes can't recreate.

jcranmer 831 days ago

> After a couple decades of coding, I can say that threading is better if it's tightly controlled, limited to usages of tight parallelism of an algorithm.

This may be a case of violent agreement, but there are a few clear cases where multithreading is easily viable. The best case is some sort of parallel-for construct, even if you include parallel reductions, although there may need to be some smarts around how to do the reduction (e.g., different methods for reduce-within-thread versus reduce-across-thread). You can extend this to heterogeneous parallel computations, a general, structured fork-join form of concurrency. But in both cases, you essentially have to forbid inter-thread communication between the fork and the join parameters. There's another case you might be able to make work, where you have a thread act as an internal server that runs all requests to completion before attempting to take on more work.
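As a rough illustration of the parallel-for-with-reduction shape described above, here is a minimal Python sketch. The names `chunked` and `parallel_reduce` are made up for this example, and it assumes the reduction operator is associative. Each worker reduces its own chunk with no shared state (reduce-within-thread), and the partial results are combined only after the join (reduce-across-thread).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def chunked(items, n_chunks):
    """Split `items` into up to `n_chunks` contiguous slices."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_reduce(items, op, identity, workers=4):
    """Structured fork-join reduction; `op` is assumed to be associative."""
    chunks = chunked(items, workers)
    # Fork: each worker reduces its own chunk and touches no shared state.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: reduce(op, c, identity), chunks))
    # Join: partial results are combined on the calling thread only.
    return reduce(op, partials, identity)


if __name__ == "__main__":
    print(parallel_reduce(list(range(1, 101)), operator.add, 0))
```

Nothing crosses between workers between the fork and the join, which is the restriction the comment argues makes this form tractable.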

What the paper you link to is pointing out, in short, is that message passing doesn't necessarily free you from the burden of shared-mutable-state-is-bad concurrency. The underlying problem is largely that communication between different threads (or even tasks within a thread) can only safely occur at a limited number of safe slots, and any communication outside of that is risky, be it an atomic RMW access, a mutex lock, or waiting on a message in a channel.

bmitc 831 days ago

> Working with threads is a pain regardless of which language you use.

That's not true at all. F#, Elixir, Erlang, LabVIEW, and several other languages make it very easy. Python makes it incredibly tough.

amethyst 831 days ago

> Python makes it incredibly tough.

I disagree, Python makes it incredibly easy to work with threads in many different ways. It just doesn't make threads faster.

bmitc 831 days ago

In what way? Threading, asyncio, tasks, event loops, multiprocessing, etc. are all complicated and interact poorly if at all. In other languages, these are effectively the same thing, lighter weight, and actually use multicore.

If I launch 50 threads with runaway while loops in Python, it takes minutes to launch and barely works after. I can run hundreds of thousands and even millions of runaway processes in Elixir/Erlang that launch very fast, and the processes keep chugging along just fine.

bb88 829 days ago

> If I launch 50 threads with runaway while loops in Python, it takes minutes to launch and barely works after. I can run hundreds of thousands and even millions of runaway processes in Elixir/Erlang that launch very fast, and the processes keep chugging along just fine.

I'm not sure that argument helps your position on threading. I once saw a Java program spin off 3000 threads doing god knows what. Debugging the fucking thing was impossible.

rmbyrro 831 days ago

The whole purpose of threads is to improve overall speed of execution. Unless you're working with a very small number of threads (single digits), that's a very hard-to-achieve goal in Python. I wouldn't count this as easy to use. It's easy to program, yes, but not easy to get working with reasonably acceptable performance.

bb88 830 days ago

And the Python people would just point to multiprocessing... which works pretty well.

rmbyrro 831 days ago

It's not such a big pain in every language. And certainly not as hard to get working with acceptable performance in many languages.

Even if you have zero shared resources, zero mutexes, no communication whatsoever between threads, it's a huge pain in Python if you need +10-ish threads going. And many times the GIL is the bottleneck.

Shog9 830 days ago

This is where Python's GIL bit me: I was more than familiar with how to shoot myself in the foot using threads in other languages, and careful to avoid those traps. Threads spun up only in situations where they had their own work to do and well-defined conditions for how both failure and success would be reported back to the thread that requested it, along with a pool that wouldn't exceed available resources.

Like every other language I've used this approach with, nothing bad happened - the program ran as expected and produced correct results. Unlike every other language, spreading calculations across multiple cores didn't appreciably improve performance. In some cases, it got slower.

Eventually scrapped it all, and went with an approach closer to what I'd have done with C and fork() decades ago... Which, to Python's credit, was fairly painless and worked well. But it caught me off-guard, because with asyncio for IO-bound stuff, it didn't seem like threads really have much of a purpose in Python, other than to be a tripwire for unwary and overconfident folks like myself!
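A minimal sketch of the experiment described above, assuming a standard GIL-enabled CPython build. The function names (`count_primes`, `run_with_executor`, `timed`) and the workload are invented for illustration. The same CPU-bound function is run sequentially and then through a thread pool, so the two timings can be compared on a given machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """Pure-Python, CPU-bound work: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_sequential(limits):
    return [count_primes(limit) for limit in limits]


def run_with_executor(limits, executor_cls=ThreadPoolExecutor, workers=4):
    # executor_cls is a parameter so the same code can be retried with
    # concurrent.futures.ProcessPoolExecutor for comparison.
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(count_primes, limits))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    limits = [20_000] * 4
    _, seq_time = timed(run_sequential, limits)
    _, thr_time = timed(run_with_executor, limits)
    print(f"sequential: {seq_time:.2f}s, 4 threads: {thr_time:.2f}s")
```

On a GIL build the threaded run is usually no faster than the sequential one, since only one thread executes Python bytecode at a time. Passing `ProcessPoolExecutor` as `executor_cls` (called from under the `if __name__ == "__main__":` guard) is one way to try the process-based approach the comment ended up with.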

bb88 830 days ago

Not disagreeing. The only case for threading in Python is for spinning something to handle IO.

But now with async even that goes away.

heinrich5991 831 days ago

Concurrency with rayon in Rust isn't pain, I'd say. It's basically hidden away from the user.

KaiserPro 831 days ago

> If you want to spawn +10-20 threads in a process, it can quickly become way slower than running a single thread.

As you know, that's mostly threads in general. Any optimisation has a drawback, so you need to choose wisely.

I once made a horror of a thing that synced S3 with another S3-like (but not quite S3) object store. I needed to move millions of files, but on the S3-like store every metadata operation took 3 seconds.

So I started with async (pro tip: it's never a good idea to use async. It's basically gotos with two dimensions of surprise: 1. when the function returns, 2. when you get an exception). I then moved to threads, which got a tiny bit of extra performance but much easier debuggability. Then I moved to multiprocess pools of threads (fuck yeah super fast) but then I started hitting network IO limits.

So then I busted out to an Airflow-like system with operators spawning 10 processes with 500 threads.

It wasn't very memory efficient, but it moved many thousands of files a second.
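To illustrate why threads paid off for this kind of latency-bound job, here is a small sketch, not the commenter's actual code. `copy_object` is a hypothetical stand-in for the slow remote call, and the 3-second metadata latency is shortened to 50 ms so the example runs quickly. Because `time.sleep` (like real blocking network IO) releases the GIL, the waits overlap across threads.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY = 0.05  # stand-in for the 3-second metadata operation


def copy_object(key):
    """Hypothetical per-object copy; the sleep models a slow remote call."""
    time.sleep(SIMULATED_LATENCY)
    return key, "copied"


def sync_all(keys, workers=20):
    # Each thread spends nearly all of its time blocked on (simulated) IO,
    # so many of them can wait concurrently.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(copy_object, keys))


if __name__ == "__main__":
    keys = [f"obj-{i}" for i in range(40)]
    start = time.perf_counter()
    sync_all(keys)
    print(f"40 objects in {time.perf_counter() - start:.2f}s")
```

Done one at a time, 40 objects at 50 ms each would take about 2 seconds. Running several such pools in separate processes is the "multiprocess pools of threads" step described above.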

scubbo 831 days ago

This is entirely fair, and I wish I'd been a little less grumpy in my initial reply (I assign some blame to just getting over an illness). Thank you for the gentle correction!

That said - I think it's fair to be irritated by people who write Python off as entirely useless because it is not _the fastest_ language. As you rightly say - it's fast enough for many purposes. It does bother me to see Python immediately counted out of discussions because of its speed when the app in question is extremely insensitive to speed.

Affric 830 days ago

It’s all about values.

I have been on teams where Python based approaches were discounted due to “speed” and “industry best practice” and then had the very same engineers create programs that are slow by design in a “fast” language and introduce needless complexity (and bugs) through “faster” database processes.

Like you said, it’s the thoughtless criticism. The meme. I am happy for Python to lose in a design analysis because it’s too slow for what we are building; I am loath to let it lose because whoever is doing the analysis with me has heard it’s slow.

Which is to say, I get what you’re saying. I think people have been a little ungenerous with your comment.

scubbo 830 days ago

> I think people have been a little ungenerous with your comment.

Eh - I engaged with a fraught topic in a snarky way without clarifying that I meant the unintuitive-but-technically-literally-accurate interpretation of my words. Maybe some people have been less-generous than they could have been, but I don't begrudge it - if I look sufficiently like a troll, I won't complain when I get treated like one. Not everyone has the time and mental fortitude to treat everyone online with infinite patience and kindness - I know I sure don't.

Thank you for the support, though!

wongarsu 831 days ago

In some ways the weakness was even a virtue. Because Python threads are slow, Python has incredible toolsets for multiprocess communication, task queues, job systems, etc.

fragmede 830 days ago

"Faster, sure" seems unnecessarily dismissive. That's the whole point of all this work.

nick238 831 days ago

Maybe it'll shut up "architects" who hack up a toy example in <new fast language hotness>, drop it on a team to add all the actual features, tests, deployment strategy, and maintain, and fly away to swoop and poop on someone else. Gee thanks for your insight; this API serves maybe 1 request a second, tops. Glad we optimized for SPEEEEEED of service over speed of development.

markhahn 831 days ago

You seem to be implying that there is something inherently slow about Python. What?

This topic is an example: a detail of one particular implementation, since the GIL is definitely not inherent to the language. Just the usual worry about looseness of types?

doctorpangloss 831 days ago

There are worse hills to die on than this. But the Python ecosystem is very slow. It's a cultural thing.

The biggest impact would be completely redoing package discovery. Not in some straightforward sense of "what if PyPI showed you a Performance Measurement?" No, that's symptomatic of the same problem: harebrained and simplistic stuff for the masses.

But who's going to get rid of PyPI? Conda tried and it sucks, it doesn't change anything fundamental, they're too small and poor to matter.

Meta should run its own package index and focus on setuptools. This is a decision PyTorch has already taken, maybe the most exciting package in Python today, and for all the headaches that decision causes, look: torch "won," it is high performance Python with a vibrant high performance ecosystem.

These same problems exist in NPM too. It isn't an engineering or language problem. Poetry and Conda are not solutions, they're symptoms. There are already too many ideas. The ecosystem already has too much manic energy spread way too thinly.

Golang has "fixed" this problem as well as it could for non-commercial communities.

pphysch 831 days ago

The "Python ecosystem" includes packages like numpy, pytorch & derivatives which are responsible for a large chunk of HPC and research computing nowadays.

Or did you mean to say the "Python language"?

doctorpangloss 831 days ago

> The "Python ecosystem" includes packages like numpy, pytorch & derivatives which are responsible for a large chunk of HPC and research computing nowadays.

The "& derivatives" part is the problem! Torch does not have derivatives. It won. You just use it and its extensions, and you're done. That is what people use to do exciting stuff in Python.

It's the manic developers writing manic derivatives that make the Python ecosystem shitty. I mean I hate ragging on those guys, because they're really nice people who care a lot about X, but if only they could focus all their energy to work together! Python has like 20 ideas for accelerated computing. They all abruptly stopped mattering because of Torch. If the numba and numpy and scikit-learn and polars and pandas and... all those people, if they would focus on working on one package together, instead of reinventing the same thing over and over again - high level cross compilers or an HPC DSL or whatever, the ecosystem would be so much nicer and performance would be better.

This idea that it's a million little ideas incubating and flourishing, it's cheerful and aesthetically pleasing but it isn't the truth. CUDA has been around for a long time, and it was obviously the fastest per dollar & watt HPC approach throughout its whole lifetime, so most of those little flourishing ideas were DOA. They should have all focused on Torch from the beginning instead of getting caught up in little manic compiler projects. We have enough compilers and languages and DSLs. I don't want another DataFrame DSL!

I see this in new, influential Python projects made even now, in 2024. Library authors are always, constantly, reinventing the wheel because the development is driven by one person's manic energy more than anything else. Just go on GitHub and look how many packages are written by one person. GitHub & Git, PyPI are just not adequate ways to coordinate the energies of these manic developers on a single valuable task. They don't merge PRs, they stake out pleasing names on PyPI, and they complain relentlessly about other people's stuff. It's NIH syndrome on the 1m+ repository scale.

fragmede 831 days ago

yeah. like xkcd 927 to the nth degree.

sneed_chucker 831 days ago

CPython is slow. That's not really something you can dispute.

It is a non-optimizing bytecode interpreter and it makes no use of JIT compilation.

JavaScript with V8 or any other modern JIT JS engine runs circles around it.

Go, Java, and C# are an order of magnitude faster but they have type systems that make optimizing compilation much easier.

There's no language-inherent reason why Python can't be at least as fast as JavaScript.
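For anyone who wants to look at the bytecode being discussed, the standard-library `dis` module shows what CPython compiles a function to. This is only a small sketch. The `add` function is an arbitrary example, and the exact opcode names differ between CPython versions.

```python
import dis


def add(a, b):
    return a + b


# Collect the opcode names the interpreter loop will execute for `add`.
opnames = [instr.opname for instr in dis.get_instructions(add)]

if __name__ == "__main__":
    dis.dis(add)  # human-readable listing; output varies by Python version
```

The `+` compiles to a generic binary-operation instruction whose behaviour is decided at run time from the operand types, not to a specialised integer or float add chosen at compile time.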

mixmastamyk 831 days ago

I've read that it can't even be as fast as JS, because everything is monkey-patchable at runtime. Maybe they can optimize for that when it doesn't happen, but that remains to be seen.

sneed_chucker 831 days ago

I've heard similar claims but I don't think it's true.

JavaScript is just as monkey-patchable. You can reassign class methods at runtime. You can even reassign an object's prototype.

Existing Python JIT runtimes and compilers are already pretty fast.

maple3142 830 days ago

Python is probably much more monkey-patchable. Almost any monkey patching that JavaScript supports also works in Python (e.g. modifying a class prototype = assigning class methods), but there are a few things that only Python can do: accessing local variables as a dict, accessing other stack frames, modifying function bytecode, reading/writing closure variables, and patching builtins to change how the language works (__import__, __build_class__). Many of them can make a language hard to optimize.
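A few of the capabilities listed above, sketched in CPython. All class and function names here are invented for the example, and `sys._getframe` is a CPython implementation detail, not a guaranteed language feature.

```python
import builtins
import sys


class Greeter:
    def greet(self):
        return "hello"


def make_counter():
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


def caller_locals():
    """Return a snapshot of the calling function's local variables."""
    return dict(sys._getframe(1).f_locals)


def demo():
    secret = 42
    return caller_locals()


# 1. Reassign a method on an existing class at runtime.
Greeter.greet = lambda self: "patched"

# 2. Read and write a closure variable from outside the closure.
counter = make_counter()
counter()                      # count is now 1
cell = counter.__closure__[0]
cell.cell_contents = 100       # cell contents are writable since Python 3.7

# 3. Replace a builtin for the whole process, then restore it.
original_len = builtins.len
builtins.len = lambda obj: -1
patched_len = len([1, 2, 3])   # resolves to the patched builtin
builtins.len = original_len
```

An optimizer has to assume any of these can happen between two calls, which is why approaches that speed up Python tend to rely on guards that detect the patch and fall back to the slow path.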

imtringued 830 days ago

You can always use optimistic optimization strategies where you profile the fast path and optimize that. When someone does something slow, you tell them to stop doing it if they want better performance.

cozzyd 830 days ago

JavaScript doesn't have to contend with a plethora of native extensions (which, to be fair, are generally a workaround for python slowness).

sneed_chucker 830 days ago

JavaScript, at least on the Node.js side, makes plenty of use of native extensions written in C++: https://nodejs.org/api/addons.html

In any case, that should be irrelevant to getting a reasonably performant JIT running. Lots of AOT and JIT compiled languages have robust FFI functionality.

The native extensions are more relevant when we talk about removing the GIL, since lots of Python code may call into non thread safe C extension code.

oivey 831 days ago

Python is inherently slow. That’s why people tend to rewrite bits that need high performance in C/C++. Removing the GIL is a massively welcome change, but it isn’t going to make C extensions go away.