Hacker News
by Zefiroj 495 days ago
The author has identified the issue as memcg reparenting causing a spike in CPU usage. Reparenting mostly solves the zombie memcg problem, where a memcg lingers after its cgroup is removed because some resource is still charged to it. In the extreme case you can end up with tens of thousands of zombies. The zombie memcg problem is not unique to cgroup v2, but reparenting is fairly recent.
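
For anyone who wants to check whether zombies are piling up on their own machine, here is a minimal sketch. It assumes cgroup v2 mounted at /sys/fs/cgroup and relies on the `nr_dying_descendants` field of `cgroup.stat`, which counts removed-but-not-yet-freed cgroups. That is the cgroup-level view and only roughly corresponds to zombie memcgs. The function names are my own, not anything from the article.

```python
def parse_cgroup_stat(text):
    """Parse the 'key value' lines of a cgroup v2 cgroup.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name <integer>" lines; ignore anything else.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def read_dying_count(cgroup_dir="/sys/fs/cgroup"):
    """Return nr_dying_descendants for cgroup_dir (assumed cgroup v2 mount point)."""
    with open(f"{cgroup_dir}/cgroup.stat") as f:
        return parse_cgroup_stat(f.read()).get("nr_dying_descendants", 0)
```

A dying count that keeps growing over time would suggest something is still charged to the removed cgroups.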

The article solves the CPU spike by disabling the io or memory controller, but for anyone who wants to use those controllers, a better way to charge memory would be nice.

Unfortunately, even when it is clear where the memory should be charged, the kernel does not provide a reliable way to charge it there deterministically. If anyone has design ideas, please chime in!

1 comment

Looking at the linked Launchpad bug, it seems the issue is lock contention.

Simply pacing the reparenting would solve the problem, and reworking the locking (perhaps to let the reparenting work in batches) would make sure it finishes relatively quickly.
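
To make the idea concrete, here is a user-space sketch of paced, batched draining. It is only an illustration of the pattern, not kernel code and not how the kernel implements reparenting. `drain_in_batches`, `move_one`, `batch_size` and `pause_s` are all hypothetical names and parameters.

```python
import threading
import time
from collections import deque


def drain_in_batches(pending, lock, move_one, batch_size=64, pause_s=0.001,
                     sleep=time.sleep):
    """Drain the deque `pending` in lock-bounded batches; return the batch count.

    The lock is held for at most `batch_size` items at a time, and the pause
    between batches gives other lock users a chance to run.
    """
    batches = 0
    while True:
        with lock:
            n = min(batch_size, len(pending))
            for _ in range(n):
                move_one(pending.popleft())
        if n == 0:
            return batches
        batches += 1
        sleep(pause_s)  # pacing: back off before taking the lock again
```

The batch size bounds how long the lock is held in one stretch, and the pause trades total completion time for lower contention.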