by 0xFFFE 1287 days ago
It appears the problem existed before lila3 was deployed on 11/22. If you look at the GC graph, the number of GC cycles/minute kept increasing gradually starting on 11/10, almost doubling by 11/21. The 11/22 deployment of lila3 reset the graph, and since you have been restarting every day since then, we can't see more than a day of growth. My wild hunch is that a code push on 11/10 caused a memory leak; worth checking, in my opinion.
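For anyone who wants to track the same GC cycles/minute number from inside the JVM rather than reading it off a graph, here is a minimal sketch using the standard java.lang.management beans. The class name GcRateSampler, the one-minute interval, and the five-sample limit are my own choices for illustration, not anything taken from the lila setup.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcRateSampler {
    // Sum of collection counts across all collectors in this JVM.
    // getCollectionCount() returns -1 when the count is undefined, so skip those.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Converts a change in collection count over an interval into cycles per minute.
    static double cyclesPerMinute(long before, long after, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return (after - before) * 60_000.0 / elapsedMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long intervalMillis = 60_000; // hypothetical sampling interval
        int samples = 5;              // hypothetical sample limit so the demo terminates
        long previous = totalGcCount();
        for (int i = 0; i < samples; i++) {
            Thread.sleep(intervalMillis);
            long current = totalGcCount();
            System.out.printf("GC cycles/minute: %.1f%n",
                    cyclesPerMinute(previous, current, intervalMillis));
            previous = current;
        }
    }
}
```

A rate that keeps climbing under steady traffic would be consistent with a leak, though it does not prove one on its own.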
2 comments

I haven't looked at the graphs, but they updated netty to 4.1.85.Final on 2022-11-10.

https://netty.io/news/2022/11/09/4-1-85-Final.html mentions that a potential memory leak was fixed. It includes a debug log warning of the leak, but enabling debug logging may be a no-no.

It could be worth reverting that to see if there's any change. Sounds cheap and harmless.

Netty is a super complex and also super poorly documented project. I did weeks of exploration and found these JVM args:

      -Dio.netty.allocator.numDirectArenas=0
      -Dio.netty.noPreferDirect=true
      -Dio.netty.noUnsafe=true
work pretty well for us on any HTTP server. They slightly reduce performance, as the HTTP pool is weaker, but decrease memory usage by 25-40%. They also eliminated one of a few memory leaks in an older version of KeyCloak.
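If you try these flags, it is easy to get one silently dropped by a launcher script, so here is a small sketch that prints what the running JVM actually received. The class name NettyFlagCheck and the "<unset>" placeholder are made up for illustration. It only reads the system properties; it does not confirm that Netty acted on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NettyFlagCheck {
    // The three property names from the JVM args above.
    static final String[] FLAGS = {
        "io.netty.allocator.numDirectArenas",
        "io.netty.noPreferDirect",
        "io.netty.noUnsafe"
    };

    // Reads each property as this JVM sees it; "<unset>" marks a flag that was not passed.
    static Map<String, String> readFlags() {
        Map<String, String> values = new LinkedHashMap<>();
        for (String name : FLAGS) {
            values.put(name, System.getProperty(name, "<unset>"));
        }
        return values;
    }

    public static void main(String[] args) {
        readFlags().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Calling readFlags() once at startup inside the real service and logging the result is probably more useful than running this as a separate program.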
There were no deployments near 11/10, according to that graph. They also say in the blog that scala2 could go for two weeks without a restart, so they're presumably aware of some sort of memory management issue; they're just okay with it.