We've been impacted by this. I migrated our services to Python 3.14 so we could attach profilers during runtime.
A couple of services looked like they had a memory leak. Memory was continuously increasing over time. Thanks to Python 3.14, we were able to use memray to understand what was going on. Those services were recreating HTTP clients (aiohttp) for every inbound request, and memory allocated by the downstream SSL lib was growing faster than it was being released.
We ended up rolling back to 3.13, which fixed the issue. I'll try again with 3.14.5.
If you are using "httpx", it's likely caused by a reference cycle. I made a PR to fix it but the maintainers haven't applied it. :-( https://github.com/encode/httpx/pull/3733
The reference cycle httpx creates is kind of a worst-case scenario for the incremental GC issue. Both the generational (3.13 and older) and incremental GC are triggered by the net new "container" objects (objects that have references to others, like lists and not like ints and floats). The short summary is that you need to create more container objects before the incremental GC triggers. In the case of the httpx reference cycle, you have a relatively small number of container objects hanging on to a lot of memory, due the SSL context data (which is a big memory hog).
Reverting back to the generational GC was the wise thing to do, even though it's a bit scary to do in a bugfix release. The incremental GC works for most people but in the minority of cases it doesn't, it uses quite a lot more memory. I'm pretty sure with some additional tuning, the incremental GC would be fine too but it just didn't get that tuning. The generational GC has literal decades of real-world use (Guido merged my patch on Jun 2000, Tim Peters did a bunch of tuning after that to optimize it).
Unfortunately, you may be the wrong gender to contribute to Encode repositories like httpx:
> I've closed off access to issues and discussions.
> I don't want to continue allowing an online environment with such an absurdly skewed gender representation. I find it intensely unwelcoming, and it's not reflective of the type of working environments I value.
We've been chasing down similar aiohttp client creation issues (liked to ...aiobotocore usage) for months now.
It's annoying that somehow talking to S3 etc requires so much churn. We have been trying to cache session objects and the like but clearly are still missing something.
Chasing this down has also made me realize how little Python libs use `weakref`, and just will build up so many circular references. The other day I figured out Django request's session infrastructure creates a circular reference meaning that requests have to get GC'd to get cleaned up in CPython.
I have a suspicion that the 3.14 problems are heavily linked to "real" workloads being almost entirely filled with cyclical objects.
It's really fascinating to read this, since I've encountered similar memory issues in other languages (ruby, go, etc.). Debugging these issues is a pain.
Is there a way to make all this much easier to debug and to prevent memory issues in the first place? Is the abstraction level not quite right?
So with CPython's reference counting, if you're good at not building strong cycles, you really can avoid garbage pressure. It's not even that complicated, it's mostly a question of making a weak reference _somewhere_ along the chain. Often the ergonomics are not great, but Python @property's are nice here.
So for example
class Request
class Session
request.session exists, and the session is "part" of the request. but session.request often exists as a facility. That's a reference cycle which prevents the request (and anything it's pointed at!) from being deallocated at the end of a request.
But in this case, you could easily do something like:
session._request = weakref.ref(request) # on session creation
and then have session.request call session._request() (and maybe assert session._request() is not None if you want to be certain). If you're confident that the session is a "child" of the request, and that you would _never_ have a hold of the session after the request is done, this is a cheap trick that makes session.request cost a little bit more but not much.
I think most Python libraries just don't do memory perf analyses here, and also "believe" in the garbage collector. When GC runs, both request and session will get deallocated, after all! But the long term effects of everyone relying on the GC are that GC is expensive when it doesn't need to be, and when looking through memory you just have more stuff to dig through
On profilers - profiling will come in 3.15, are you referring to remote exec? It is a great feature I am very exited about, at the same time afraid that the company won’t allow ptrace capability in prod.
yes. remote exec allows me to attach profilers (e.g. memray) directly into a running process. i'm also excited about the upcoming statistical (cpu) profiler from 3.15
"Python 3.14 shipped with a new incremental garbage collector. However, we’ve had a number of reports of significant memory pressure in production environments.
We’ve decided to revert it in both 3.14 and 3.15, and go back to the generational GC from 3.13."
The main benefit of python to me is that while slow, it's predictable. I do think they're going to get a lot more resistance to adding JITs, moving GCs, etc. it will become java with a million knobs to tune. If people want a JIT'd python just use pypy, right?
Java lost almost all those knobs a while ago (I mean they're there, but you're better off relying on the defaults). The modern GCs have one or at most two knobs remaining, and even that will become unnecessary next year. As to predictablity, you get maximal pause time of well under 1ms for heaps up to 16TB.
> Compared to Python's, all of them are beyond perfect.
I somehow understand the situation less after reading this.
Is Python's GC bad, or are there cyclic reference issues? Is it possible to detect cyclic references perfectly? What does beyond perfect mean? If we have 7 and 0.1% of the time you need one of the 6 that is non-default, how do we choose? Is the understated version of "Compared to Python's, all of them are beyond perfect" "I think Java's are great"? If not, what about Python's impl makes it so lackluster to any of 7 of Java's?
> Lately, they seems to work with CRIU, various heuristics, multi-stage in-process bytecode compilation ..
Not sure what you mean by this, as this has nothing to do with GC, and Java has had a multi-tier optimising compiler for 15 years now.
> that nobody else have, so fixes are available
Go has much worse problems with GC than Java does these days, and nobody else is able to achieve similar performance in large programs with heavy workloads. So everyone else lives with less sophisticated compilers and memory management simply by accepting worse performance.
Libraries. I use both languages, and a survey of what libraries are available is part of picking an implementation language when starting a greenfield project.
$ time ./a.out
real 0m0.002s
user 0m0.000s
sys 0m0.002s
A do-nothing Go program:
$ time ./tmp
real 0m0.002s
user 0m0.000s
sys 0m0.003s
I don't believe Go has any optimizations to not start its runtime if it isn't necessary, but when I added spawning a goroutine that immediately blocks on a channel read that will never come the numbers didn't change. That doesn't really time the runtime. Probably the program terminated before the goroutine was scheduled to run anything. It just makes it so there definitely wasn't an early exit because the compiler or the runtime "realized" it didn't need to start the runtime.
I'm sure the Go program is somewhat slower to start and end than C, and that we're running into the limits of how quickly processes can be spawned and other timing overhead which is obscuring the difference. However for practical purposes, "it starts up in less than the overhead for starting a process in the shell" is the same speed for most purposes.
Not even a "do nothing" Python program, no Python program at all:
$ time python3 -c 1
real 0m0.012s
user 0m0.008s
sys 0m0.004s
If you had a Go program that was slow to start up, it was your program, not Go. By contrast, Python, and the dynamic scripting languages in general, can be quite slow to start up, just in the reading and compiling of the code. (Even .pyc files, IIRC, take processing, just less processing than Python source code... it's still nowhere near "memory map it in and go" as it is for statically-compiled languages.)
What? Compared to Python they're like lightning. Typically milliseconds to the start of main() - admittedly they can be slowed down by init() nonsense and terrible generated protobuf code nonsense in deep dependency trees - but with a non-trivial Python program you can look forward to an order of magnitude more. There are techniques to help address that but (1) they're not idiomatic and (2) it still only mitigates it.
I suppose Go programs are slower than the equivalent thing in C or C++, but I'm not sure that's a very relevant comparison in most cases today (how many new things being written would choose those languages).
PyPy is not looking healthy right now - it's several versions behind in support and, while it's not dead, it looks like it might be settling down for a rest.
Obviously it's not easy to move the whole language of a big codebase, but I feel a lot of this stuff (fiddling with GC, JITing, type hints, and I'm dubious about the free-threading stuff) tries to take Python somewhere it isn't really good at, and if that's what you want, you really want a different language.
Well, they never made the jump to Python 3. But shipping 2.7 interpreters in 2024 was quite an achievement on its own. So their users already know this pain. And from my experience in academia, python 2.7 and java 8 will probably be used for another 20 years before the last machine running that stuff burns out.
Why are people still building systems on top of a language that continually undergoes fundamental changes nearly 40 years after release? Is this not the strongest indication that this language is not well designed, it is unstable, and encounters many issues that flat out don't exist in other high level languages?
What language that is actually used 40 years after release isn't undergoing big, fundamental changes?
Java? Nope, you're getting a fundamental change in Valhalla
C++? Nope, new language edition every few years with fundamental changes
C? C23 has a number of fairly fundamental changes, expect more in the next language revision
I think your sense of causality is backwards here. These languages are getting fundamental changes because they're being widely used. That is what motivates and drives the change. Languages with no users don't need to change.
As you say, any widely used language gets fundamental changes from time to time.
But most such languages handle much better the compatibility with legacy applications.
Python is the main culprit in most cases when I see conflicts between various software packages that insist to use only a specific version of their dependencies. This is why I have to keep installed many versions of Python, and the Linux distribution that I use must take care to prevent interference between those Python versions.
That's fine, but that's clearly not what I'm talking about.
Languages like F#, Elixir, etc. don't undergo fundamental changes. Yes, every language evolves. But for Python, we're talking about grafting literally fundamental stuff on top of a language not designed for any of these things.
For example, if someone went and redesigned Python to solve its warts, you'd basically end up with F#.
I'm currently in a .NET shop so not an issue for me, makes me wonder if Python will eventually adopt the concept of LTS releases, this could have been avoided as an issue if it was part of a non-LTS release.
If all releases are LTS, then none are. Part of the point GP was making is that when some releases have a very short maintenance window, then changes that are terrible in them don't need to be reverted (since the maintenance window will close soon anyways).
Yeah it seems like a miss. I guess the thinking was that it wasn't developer-facing and just an internal optimization. But of course any change to garbage collection will change the memory and cpu dynamics of the process in a material way.
.NET seems to have regularly changed the garbage collector over the years and I do not remember any similar surprises in production. I wonder why they have had better experience?
I thought that by now dynamic garbage collection was a known quantity so that making changes, outside of out right bugs, is fairly safe and predictable?
One thing Microsoft does really well is eating its own dogfood and Microsoft feeds a ton of .Net dogs.
So any change to GC starts with massive .Net MSFT code base so they get extremely good telemetry back about any downsides and might be able to fix it in time.
There is almost no dog fooding on Windows development since version 8, Typescript team rather rewrite the compiler in Go, Azure has plenty of Go, Rust and Java projects alongside .NET.
Oh, they really don't dogfood Windows development any longer, regardless of the incentives.
I have my WinRT 8, UAP 8.1, UWP 10, Project Reunion, .NET Native, C++/CX, C++/WinRT, XAML Islands, XAML Direct, WinUI 2.0, WinUi 3.0, WinAppSDK and what not scars to prove how they aren't dog fooding any piece of it in any meaningful manner.
Heck they keep talking about C++ support in WinUI 3, as if the team hasn't left the project and is now playing with Rust instead.
They managed that plenty of early WinRT advocates became their hardest critics, while not believing anything else they put out, like now this Windows K2 project.
I like my programming language flame wars just as much as the next guy but Go is a really easy language to get started with, while also being very fast. It's not just luck
What? If you are talking web development, .Net is just about the same as Go. It's 100% Java OOP type writing but result is same, very performant API server.
Sure, Rust is completely different beast with different target system.
Actually there’s a change to dotnet 9 with how it handles the heap and GC which caused major issues for us.
I’ll confess the reason it hit us so hard is because the code quality was so low and wasteful on allocations that it didn’t hide the problem as well as previous versions.
I remember working on the Windows Update back-end at Microsoft around 2005, and we had a problem where it would freeze up periodically, and not surprisingly that turned out to be caused by GC. But we noticed it before shipping, and we just tweaked some GC parameters.
So I think it was not a big problem for .Net because it gave you enough control over GC, and because people tested their code before putting it in production.
The problem is that you create an HTTP client for each incoming request. In other words, you recreate SSL context and cause reference cycles with each request. From the memory point of view, the program seems to have a memory leak. However, the solution to the issue lies at the application level: just create your client once and reuse it for each subsequent request.
One of heuristics to find memory leaks can be stated as follows: if you instantiate any HTTP or connection objects inside the request handler, it's likely that you've made a mistake.
All these issues were known in previous attempts for removing the GIL. But if Instagram/Meta want it, everyone stands to attention and finds out the obvious problems years later. Kind of like in geopolitics.
I hope Meta switches Instagram to PHP/Hack so they leave Python alone.
In the world of AI written code, Python just doesn’t make sense. Converted about 100k lines in the last few months to golang and the performance is life changing. Curious if we will see global Python adoption fall by 75% or more in the next few years.
With a similar amount of experience with both languages I found Go much easier to read. I've always been a bit miffed why Python is seen as easy to read for experienced developers. I get the syntax is good for short code or people with little experience but my experience is those readability benefits went away quickly with time or complexity.
Why are you miffed about it? I legitimately hate reading golang with passion and find python to be pretty intuitive, outside of the odd ambitious list comprehensions. I worked in a golang shop for several years, so it's not just an familiarity situation either.
We are just different. That's not something to be mad about.
In my opinion most interpreted languages today tend to produce very dense code. Fancy call chains and closures interleaving. If you look for a subtle bug those are hard to reason about, you have to know the details of a lot of different APIs.
Go is verbose partly for that reason, but a silly loop is a silly loop. The constraints are clear, you only have to do the logic.
Any language that uses error codes instead of exceptions is a non-starter for me. Produces code that craps all over the happy path.
Python has a different problem: it is slow as f---. I did a micro benchmark comparison against 5 other languages in preparation for my python replacement language. Outside of dictionary lookups, it is 50-600 times slower than C depending on the workload.
Go, Rust etc are fine. They land at 1.25-3x slower than C. But I prefer the readability of python minus its dynamic nature.
Exceptions are not the only alternative to error codes.
Many early programming languages handled errors by providing multiple return addresses to functions.
Thus a function returned to the place of invocation in the normal case, otherwise it returned to one of the error handlers that were grouped at the end of the function that invoked it, away from the happy path.
This was more efficient than the modern way of returning an error code, as it eliminated the superfluous testing of the error code and the expensive conditional jump based on the result of the testing.
Moreover, in my opinion in the majority of the cases maximum information about an error is inside the function that has invoked the function where the error happened and neither in an exception handler several levels above it, nor inside the invoked function, so the place where the error handlers had to be put in this old method is actually optimal for meaningful error messages.
From the point of view of the source text, this old error handling method is equivalent with using exceptions with the constraint of placing the exception handlers in the function that has invoked the function that generates the exception. This limitation enabled a more efficient implementation than for exceptions where the handler can be placed anywhere above the invoked function and a complex stack unwinding may be necessary.
For personal projects, yes. For code going into production, you still need human code review, and that has to happen in a language that the humans you've hired are comfortable with. One day, we'll all be YOLOing vibe code straight into production, but that day is not today.
nothing about the performance characteristics of python changed with AI so why would you use python over golang if performance is a requirement/bottleneck? Trying to understand the reasoning as to me golang and python are equally simple to write and understand.
Regardless of whether golang and python are actually equally simple, python certainly has the reputation of being easier to write and read than almost any other language. That is a big part of its popularity.
Python is not really simple though, the semantics are actually quite bonkers. It just has "simple"-looking syntax, but that only helps you for trivial programs where the bonkers semantics does not get in the way.
I think we'll eventually be generating machine code directly. But until then we should be using code that our team can actually read and understand. If you know go, then that works you, Not everyone does.
Doubt it. LLMs will always be more expensive per-token than compilers, and high level languages need fewer tokens than machine code. Also, type systems, warnings, overlap with natural language in names - those are very useful.
A couple of services looked like they had a memory leak. Memory was continuously increasing over time. Thanks to Python 3.14, we were able to use memray to understand what was going on. Those services were recreating HTTP clients (aiohttp) for every inbound request, and memory allocated by the downstream SSL lib was growing faster than it was being released.
We ended up rolling back to 3.13, which fixed the issue. I'll try again with 3.14.5.