by kahseng 2725 days ago
Reminds me of a time at Quora in 2011 when we saw Python GC hurting 99th-percentile server-side site speed. Drawing inspiration from HFT, where some companies would disable JVM GC during trading hours and run it offline, I thought about how to periodically take some backends offline so that GC would not happen on user requests. A simpler operational solution emerged, though: I just had to disable GC on user requests and make it happen only on a special "/_gc" endpoint. I then made the frequent nginx/haproxy backend health check do double duty by pointing it at that endpoint, which ensured that all backends ran GC frequently and that the time spent there affected only the health-check requests, not end users.

edit: added more details I remembered later
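
A minimal sketch of the idea described above, assuming a plain WSGI app. This is not the actual Quora code: `app` and the response bodies are hypothetical, and only the "/_gc" path comes from the comment.

```python
import gc

# Turn off automatic cyclic collection for this worker process.
# Reference counting still frees non-cyclic objects immediately.
gc.disable()


def app(environ, start_response):
    """Hypothetical WSGI app: cyclic GC runs only when /_gc is requested."""
    if environ.get("PATH_INFO") == "/_gc":
        # The load balancer's health check pays the collection cost here.
        unreachable = gc.collect()
        body = f"collected {unreachable}\n".encode()
    else:
        # Normal user request: no cyclic GC pause is triggered automatically.
        body = b"hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]
```

The load balancer would then be configured to use "/_gc" as its health-check URL, so every backend collects at the health-check interval.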

2 comments

In the Ruby world this is called "out-of-band GC" and is supported by multiple application servers: http://tmm1.net/ruby21-oobgc/
This is a very interesting approach, what happened with memory footprint when you did this?
Thanks, I don't think I saw much impact at all in aggregate. Memory consumption on these web servers was dominated by objects we intentionally stored per request or globally, not by temporary/unreferenced Python objects.
Even while GC is delayed, Python (CPython at least) will still free objects through reference counting. Only objects caught in reference cycles should stick around until the next GC run. So most short-lived temporaries are reclaimed as soon as they go out of scope.
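
A small sketch of that behavior, assuming CPython (other implementations manage memory differently). `Node` and the flag names are made up for illustration.

```python
import gc
import weakref


class Node:
    """Hypothetical object type used only for this demonstration."""


gc.disable()   # no automatic cyclic collection from here on
gc.collect()   # start from a clean state

# Non-cyclic temporary: freed as soon as its last reference goes away.
a = Node()
a_probe = weakref.ref(a)
del a
plain_freed = a_probe() is None

# Self-referencing cycle: the reference count never drops to zero by itself.
b = Node()
b.self_ref = b
b_probe = weakref.ref(b)
del b
cycle_survives = b_probe() is not None

# An explicit collection is what finally reclaims the cycle.
gc.collect()
cycle_freed = b_probe() is None

gc.enable()
```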
Theoretically, with your code structured right, you can disable the cyclic garbage collector outright: it only deals with reference cycles, which you can explicitly avoid by using the weakref module.

Not entirely sure how you'd go about writing code like that, but it's possible.
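
One possible way to structure such code, as a sketch only: make back-references weak so that a parent/child pair never forms a strong cycle. `Parent` and `Child` are hypothetical names, and the immediate freeing shown here relies on CPython's reference counting.

```python
import gc
import weakref


class Parent:
    def __init__(self):
        self.children = []  # strong references, parent -> child


class Child:
    def __init__(self, parent):
        # Weak back-reference, child -> parent, so no strong cycle exists.
        self._parent = weakref.ref(parent)
        parent.children.append(self)

    @property
    def parent(self):
        # Returns None once the parent has been freed.
        return self._parent()


gc.disable()   # the cyclic collector is never needed for these objects

p = Parent()
c = Child(p)
probe = weakref.ref(p)
linked = c.parent is p

del p          # last strong reference to the parent goes away
parent_freed = probe() is None   # reclaimed by reference counting alone
orphan_parent = c.parent         # the child now sees no parent

gc.enable()
```

The cost of this design is that every access through the weak reference has to handle the case where the target is already gone.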

Nice! I'll have to give this a try at some point if I run into GC-related latency issues and see if it works on my systems.