Wasn't CPython supposed to remain very simple in its codebase, with the heavy optimization left for other implementations to tackle? I seem to remember hearing as much a few years back.
The problem is that:
* CPython is slow, making extension modules written in C(++) very attractive
* The CPython extension API exposes many implementation details
* Making use of those implementation details helps those extension modules be even faster
This resulted in a situation where the ecosystem is locked in to those implementation details: CPython can't change many aspects of its own implementation without breaking the ecosystem, and other implementations are forced to introduce complex and slow emulation layers if they want to be compatible with existing CPython extension modules.
The end result is that alternative implementations are not viable in practice, as most existing libraries don't work without their CPython extension modules -- users of alternative implementations are essentially stuck in their own tiny ecosystem and cannot make use of the large existing (C)Python ecosystem.
CPython at least is in a position where it can push a breaking change to the extension API and most libraries will be forced to adapt. But there's very little incentive for library authors to add separate code paths for other Python implementations, so I don't think other implementations can become viable until CPython cleans up its API.
That was the original idea, when Python started attracting interest from big corporations. It has however become clear that maintaining alternative implementations is very difficult and resource-intensive; and if you have to maintain compatibility with the wider ecosystem anyway (because that's what users want), you might as well work with upstream to find solutions that work for everyone.
To date, no Python implementation has managed to hit all three:
1. Stay compatible with any recent, modern CPython version
2. Maintain performance for general-purpose usage (it's fast enough without a warmup, and doesn't need to be heavily parallelized to see a performance benefit)
3. Stay alive
Which, frankly, is kind of a shame. But the truth of the matter is that it was a high bar to hit in the first place, and even PyPy (which arguably had the biggest advantages: interest, mindshare, compatibility, meaningful wins) barely managed to crack a fraction of a percent of Python market share.
If you bet on other implementations being the source of performance wins, you're betting on something which essentially doesn't exist at this point.
Isn't PyPy up to 3.10 by now? At least that's what Homebrew reports to me.
PyPy seems pretty alive, all things considered, and for my code bases I've seen pretty dramatic speedups on the order of 2-5x. That's basically a no-brainer unless I'm doing something with incompatible C extensions, which I think is the real Achilles' heel of all of these alternative implementations.
PyPy has definitely had the most success of all other implementations, but it still has a painful warmup period for many workloads. I can't imagine it's an effective option for anyone to install as the default Python implementation on their laptop, for instance. And for many, many years, it had almost no modern Python support (but I'm of course very glad to see it's slowly catching up).
It is encouraging for PyPy to see some influx of money in recent years. But I will continue to patiently wait for it to hit enough of a sweet spot of performance vs usability vs compatibility to see real adoption.
Does Python even have a language specification? I've been told that CPython IS the specification. I don't know if this is still true. In the Java world there is a specification and a set of tests to check for conformance, so it's easier to have alternative implementations of the JVM. If what I said is correct, then I can see how the optimized alternative implementation idea is less likely to happen.
Well, for Python the language reference in the docs[0] is the specification, and many things there are described as CPython implementation details. Like: "CPython implementation detail: For CPython, id(x) is the memory address where x is stored." And as another example, dicts remembering insertion order was CPython's implementation detail in 3.6, but from 3.7 it's part of the language.
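To make the distinction concrete, here's a small sketch (the helper name `describe_runtime` is my own, not anything from the docs). It only relies on what the language guarantees: dict insertion order from 3.7 onward, and `id()` returning an integer. It deliberately does not assume that `id()` is a memory address, since that part is the CPython implementation detail.

```python
import platform
import sys


def describe_runtime():
    """Collect a few facts about the running interpreter (hypothetical helper)."""
    d = {}
    for key in ("c", "a", "b"):
        d[key] = None
    return {
        "implementation": platform.python_implementation(),
        "version": sys.version_info[:2],
        # Language guarantee since 3.7: keys come back in insertion order.
        "dict_order": list(d),
        # Language guarantee: id() is an integer, unique among live objects.
        # That it is a memory address is a CPython implementation detail.
        "id_is_int": isinstance(id(d), int),
    }


if __name__ == "__main__":
    for name, value in describe_runtime().items():
        print(f"{name}: {value}")
```

Code that sticks to the guaranteed parts should behave the same on any conforming implementation; code that treats `id()` as an address would not.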
There is at least a pretty detailed reference that distinguishes between CPython implementation details and language features. There was even a JVM Python implementation. The problem is more that a lot of the libraries that everyone wants to use are very dependent on CPython's FFI, which bleeds a lot of internals.