by masklinn 3756 days ago
That doesn't have anything to do with Poettering's quote.

PR_SET_CHILD_SUBREAPER reparents an orphaned process to whichever ancestor set the flag rather than to the default PID1, and that only works for descendants of the subreaper.

The problem pointed out by the quote is that normal software doesn't go around checking whether it has zombie children and waiting on them, so in a container with random software S set as PID1 and creating subprocesses, zombies may accumulate until resources are exhausted[0].
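To make "zombie" concrete, here is a minimal sketch (Linux, Python standard library only; the function name `zombie_demo` is made up for illustration). It forks a child that exits immediately, peeks at it with `WNOWAIT` to show the exit status is still being held by the kernel, and only then reaps it. A PID1 that never makes the `waitpid` call leaves every such entry in the process table.

```python
import os

def zombie_demo():
    """Fork a child that exits at once, then show it lingers until waited on."""
    pid = os.fork()
    if pid == 0:
        os._exit(3)  # child: exit immediately with status 3

    # Peek at the exited child WITHOUT reaping it (WNOWAIT).
    # The process is dead, but its table entry is still there: a zombie.
    info = os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    still_waitable = info.si_pid == pid

    # This is the step "normal software" tends to skip: the actual reap.
    _, status = os.waitpid(pid, 0)

    # Once reaped, the entry is gone and a second wait fails.
    try:
        os.waitpid(pid, 0)
        gone = False
    except ChildProcessError:
        gone = True

    return still_waitable, os.WEXITSTATUS(status), gone
```

Calling `zombie_demo()` returns a tuple of (was the dead child still waitable, its exit status, was it gone after the reap).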

PR_SET_CHILD_SUBREAPER is a way to cause that problem on a system with a proper init (or to test that your init works properly without needing to boot into it).

It's not a new observation: https://blog.phusion.nl/2015/01/20/docker-and-the-pid-1-zomb...

Previous HN discussion: https://news.ycombinator.com/item?id=8916785

[0] By default the limit is 32k processes (kernel.pid_max typically defaults to 32768), after which the kernel will simply refuse to create new ones.


Yes it does. He's claiming that systemd should manage the container processes as pid1, because systemd will then clean up the zombies. But anything that reaps zombies can be pid1 -- systemd isn't special in this regard. And even if you did use something that didn't reap zombies as pid1, you could leverage PR_SET_CHILD_SUBREAPER as some other non-pid1 process to grab zombies for descendants it spawns.

If you do use PR_SET_CHILD_SUBREAPER, then you need to reap whatever gets reparented to you; if you don't do this then the process table will eventually fill up with zombies. He is correct that few programs do that, but there's nothing that requires that to be done by pid1 if all the processes within the container are spawned by something that provides that functionality and uses PR_SET_CHILD_SUBREAPER.
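A rough sketch of what that looks like in practice, assuming Linux and calling `prctl(2)` through ctypes (the constant 36 is PR_SET_CHILD_SUBREAPER from `<linux/prctl.h>`; the helper names `become_subreaper`, `spawn_orphan` and `subreaper_demo` are invented for this example). A non-PID1 process marks itself as subreaper, a grandchild gets orphaned, and the subreaper has to wait on it itself.

```python
import ctypes
import os
import struct
import time

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>

def become_subreaper():
    """Mark the calling process as subreaper for its descendants (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1),
                    ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

def spawn_orphan():
    """Fork a child that forks a grandchild and exits, orphaning the grandchild."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(r)
        grandchild = os.fork()
        if grandchild == 0:
            os.close(w)
            time.sleep(0.2)   # outlive the intermediate parent
            os._exit(7)
        os.write(w, struct.pack("i", grandchild))  # report grandchild's PID
        os._exit(0)           # intermediate parent exits without waiting
    os.close(w)
    (grandchild,) = struct.unpack("i", os.read(r, struct.calcsize("i")))
    os.close(r)
    return child, grandchild

def subreaper_demo():
    become_subreaper()
    child, grandchild = spawn_orphan()
    os.waitpid(child, 0)
    # The orphan was reparented to us, not to PID1, so we must reap it.
    # Without the prctl call this waitpid raises ChildProcessError.
    _, status = os.waitpid(grandchild, 0)
    return os.WEXITSTATUS(status)
```

`subreaper_demo()` returns the orphaned grandchild's exit status. A real supervisor would loop on `os.waitpid(-1, ...)` instead of waiting for one known PID.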

> Yes it does.

No, it still doesn't, sorry.

> He's claiming that systemd should manage the container processes as pid1, because systemd will then clean up the zombies.

The part that's quoted only notes that PID1 is responsible for reaping orphaned zombies, that Random P. Application Process most likely doesn't do that, and that it causes problems.

> But anything that reaps zombies can be pid1 -- systemd isn't special in this regard.

The part you've quoted doesn't try to claim otherwise.

> And even if you did use something that didn't reap zombies as pid1, you could leverage PR_SET_CHILD_SUBREAPER as some other non-pid1 process to grab zombies for descendants it spawns.

That's a completely inane claim. The whole point of the article is the issue of people starting their application process as PID1. What are you suggesting: that applications should be modified to spawn an init which would use PR_SET_CHILD_SUBREAPER, and to which they would delegate spawning subprocesses? That's utter lunacy. Have some decency and regard for basic sanity and the context in which the quote appears.

> If you do use PR_SET_CHILD_SUBREAPER, then you need to reap whatever gets reparented to you; if you don't do this then the process table will eventually fill up with zombies. He is correct that few programs do that, but there's nothing that requires that to be done by pid1 if all the processes within the container are spawned by something that provides that functionality and uses PR_SET_CHILD_SUBREAPER.

Are you just making up that hare-brained bullshit on the spot so that you don't have to admit your original comment was wrong?

What's the point of spawning a broken PID1 just so you can spawn a process that uses PR_SET_CHILD_SUBREAPER and does the actual reaping correctly? Just spawn that as PID1 in the first place FFS.
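For reference, "spawn that as PID1" amounts to something like the following. This is a bare-bones sketch, not how any particular init is implemented; `run_as_init` is a hypothetical name, and signal forwarding is left out. It starts the real application as a child, reaps whatever exits (including orphans reparented to it), and returns the application's exit code.

```python
import os

def run_as_init(argv):
    """Run argv as a child and reap every child that exits until it finishes."""
    main = os.fork()
    if main == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # exec failed

    # Block in waitpid(-1): this collects the main child AND any orphaned
    # descendants that were reparented to us because we are PID1.
    while True:
        pid, status = os.waitpid(-1, 0)
        if pid == main:
            break

    # Main process is done; drain any zombies that are already waiting.
    while True:
        try:
            pid, _ = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children remain but none have exited yet

    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)  # shell-style code for death by signal
```

For example, `run_as_init(["sh", "-c", "exit 5"])` runs the shell as the supervised application and hands back its exit code.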

This is true, but what if the thing that spins up the actual container process sets this?

What do you mean "the thing that spins up the actual container"? The root process for the container? It's already PID1. The external process creating the container? It's sitting outside the container and "below" PID1. What could that do that'd be of any use?