Hacker News
Some Reflections on Writing Unix Daemons (tratt.net)
97 points by ltratt 845 days ago
8 comments

> Rather than having processes detach themselves from the terminal, these managers run daemons as if they were normal (albeit long-running) programs. This slightly simplifies the daemons themselves and provides a more homogeneous experience for the user.

It tremendously simplifies the daemons themselves and indeed does provide a more homogeneous experience for the user. Remember: do one thing only, and do it well. The "daemonizing" part is a second thing and it belongs in a separate utility. If the user wants to run your daemon "interactively" (e.g. being able to press Ctrl-C to stop it), they should be able to do so by simply running your daemon from the shell. If the user wants to run your daemon "in the background", whatever it means to the user, they can arrange that themselves. Why is it such a difficult idea for the "daemon" writers to accept is beyond me: there is negative value in a program forcefully detaching itself from every single way to supervise it (double- and triple-forking defeats supervising via wait(), closing FDs defeats watching for the closing of an injected pipe, etc).

Not to mention that in order to properly start and stop such double-forked daemons, you need everyone to agree on how pidfiles work, where to store them, how to garbage collect them, where to write logs, their permissions, how to rotate them, etc etc etc.

With the systemd/“just stay in the foreground” approach you just stay in the foreground and the process monitor can just wait() on you. And for logging you can just write to stdout and let the process monitor unify all the logging. Do one thing and do it well indeed.
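
To make the wait()-and-stdout point concrete, here is a minimal sketch of what a process monitor does with a foreground daemon. The `supervise` function and the one-line "daemon" are made up for illustration; a real supervisor would also restart the child, timestamp the log lines, and so on.

```python
import subprocess
import sys


def supervise(argv):
    """Run argv as a foreground child, collect its output, and wait for it.

    Returns (exit_status, log_lines). Hypothetical helper, not any real
    service manager's API.
    """
    with subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into one log stream
        text=True,
    ) as child:
        # The daemon just writes to stdout; the monitor owns the logging.
        log_lines = [line.rstrip("\n") for line in child.stdout]
        # The child never forked away, so wait() reports exactly when it
        # exited and with what status. No pidfile is involved.
        status = child.wait()
    return status, log_lines


if __name__ == "__main__":
    # Stand-in "daemon": prints one line and exits with status 3.
    daemon = [sys.executable, "-c", "print('serving'); raise SystemExit(3)"]
    print(supervise(daemon))
```

If the child double-forked instead, `wait()` would return almost immediately with status 0 and the monitor would learn nothing about the real service.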

> Why is it such a difficult idea for the "daemon" writers to accept is beyond me

Because the old documentation was good, and the new documentation is awful.

The old documentation often wasn't good.

But if you're trying to replace an existing system, it's a mistake to think "the old system had poor documentation so it's OK for the new system's documentation to be poor too", because people have already learned to cope with the old system.

I've seen people make that mistake distressingly often in recent years.

> Why is it such a difficult idea for the "daemon" writers to accept is beyond me

There's plenty of old documentation out there saying things like "a proper daemon completely detaches from its controlling terminal, opens log files, etc."

There is also plenty of old documentation that strongly advises against doing that, see [0] for references (e.g. the aside with an excerpt from the 1995-era AIX docs).

[0] https://jdebp.uk/FGA/unix-daemon-design-mistakes-to-avoid.ht...

> Why is it such a difficult idea for the "daemon" writers to accept is beyond me

Because the basic OS interface is a 50-year-old design with "terminals" being the central I/O entity.

Differentiating between "interactive" and "daemon" mode becomes tricky for a daemon writer. In interactive mode, SIGHUP must terminate your process. In daemon mode, you may interpret it any way you like. Same for SIGINT, SIGQUIT, SIGPIPE, etc.

Tricky? Checking whether you're bound to a TTY is trivial, checking whether you have a parent is trivial, checking whether an interactive shell is somewhere in the process tree is trivial, having a few ifs in signal handlers is trivial, etc.
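
For what it's worth, a rough sketch of the first and last of those checks might look like this. The function names and the "reload on SIGHUP" policy are assumptions for illustration, not a convention every daemon follows.

```python
import os
import signal


def attached_to_terminal(fd=0):
    """True if the descriptor (stdin by default) refers to a terminal."""
    return os.isatty(fd)


def install_sighup_handler(interactive, on_reload):
    """Pick SIGHUP behaviour depending on how the program was started.

    Must be called from the main thread, as signal.signal() requires.
    """
    if interactive:
        # On a terminal: keep the default action, which terminates us.
        signal.signal(signal.SIGHUP, signal.SIG_DFL)
    else:
        # Under a supervisor: treat SIGHUP as "reload configuration".
        signal.signal(signal.SIGHUP, lambda signum, frame: on_reload())


if __name__ == "__main__":
    interactive = attached_to_terminal()
    install_sighup_handler(interactive, on_reload=lambda: print("reloading"))
    print("interactive" if interactive else "supervised")
```

Run it from a shell and it reports "interactive"; run it with stdin redirected from a pipe or /dev/null and it reports "supervised".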

Besides, "daemon writers" could just default to foreground: if it's run as a daemon, it's run from a script, so nobody cares about typing an extra option such as -d or --daemon. It's written once in a file, as opposed to typing "program --foreground" every time you want it to run in front of you.

systemd took away the pain. You no longer have to think about and reinvent daemonizing, logging and so on. Just start whatever you want in a while(1) loop and write to stdout and stderr. No log rotation nightmare, etc.

This makes daemons easier to debug, as they just run in the foreground, so you can start them from the CLI to tinker with them.

systemd is much debated; some love it and some hate it. But such improvements are unmatched and extremely helpful.

OpenRC, daemontools, and many other service managers provide daemonizing wrappers, and all substantially predate systemd.

The Systemd Cabal are far from the first to have noticed and attempted to resolve most of the problems they tackled... they do have the most effective "developer relations" team of all of the projects in this space, though.

(I do continue to love their old "It's so much better than System V Init!" talking point... as well as their "You'll never have to write a shell script to start or manage a service again! Declarative, INI-formatted Unit Files For Everyone!" one.)

The systemd cabal one-upped all other efforts that pre-dated them by leveraging cgroups such that what rlimits your services run with is dictated by the unit file alone, and not by whatever your user session had when you issued the start or restart command.
And they did so by sweeping the entire UNIX landscape into the dustbin, only supporting linux, and in return giving us a management system that can't be sure anything is running[1].

systemd is leaps and bounds better than most of the init systems that came before it, but it's not a good daemon supervisor, and the developers have basically declared that it's "good enough" so improvements are unwelcome.

1 - http://ewontfix.com/15/

That link is a bit weird. https://www.freedesktop.org/software/systemd/man/latest/sd_n... documents exactly what is sent and how. You can implement that function yourself very easily. Their edge cases are solved by opening the notification socket before you chroot. Writing to a socket would be just as systemd-specific as the current solution if that's what systemd would expect.
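
As a sketch of how small a do-it-yourself version can be: the protocol is a datagram sent to the Unix socket named in the `NOTIFY_SOCKET` environment variable. The function below is an illustration, not the library's implementation; it ignores credentials passing, file-descriptor passing and error handling.

```python
import os
import socket


def sd_notify(state="READY=1"):
    """Send one notification datagram to the service manager, if any.

    Returns True if a message was sent, False if NOTIFY_SOCKET is unset
    (for example when the program was started from a shell).
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket, which
        # is addressed with a leading NUL byte instead.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.sendall(state.encode())
    return True
```

A service would call `sd_notify("READY=1")` once it has finished starting up; when it is not running under a manager the call is a no-op.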
Also, do you happen to know how, say, macOS manages service dependencies with launchd and makes sure everything is running?

It doesn’t.

It explicitly says in the scarce documentation that it doesn’t care, and that interdependent services should use IPC and in general figure it out between themselves. launchd only launches the processes as soon as it can, and restarts them based on a boolean flag if they die. That’s all.

systemd at least gives you something to base your expectations on.

Adding a mechanism for dependencies which kind of sometimes sort of works but doesn't actually work is much, much worse than just not implementing anything at all, as you are much less likely to get anyone to put in the work required to do the hard thing they need to do for it to actually work.
Well, do systems like, say, FreeBSD even support facilities functionally equivalent to control groups? Even if they do (which they don’t), it would be a whole new separate implementation, and there’s not enough interest from anyone who can do it to roll their sleeves and work on it.
FreeBSD, specifically, has had jails and Capsicum longer than systemd has existed, and rctl (aka Resource Limits) for about a decade now.

Like I said, systemd is a great init system but it really did not bring anything to the table as far as service management goes. Apple's abdication of the problem makes sense, since they've been tightly focused on the workstation ever since killing the Xserve line in 2011.

Answer to the question: yes, and more. Jails+rctl (available since 2012) is not just cgroups; it's cgroups+SELinux+AppArmor. A vanilla Linux container is not a security barrier; a vanilla FreeBSD jail is.

In practice this means more seamless 'isolation' in the Linux case, but that isolation is weak. This corresponds perfectly to FreeBSD looking at server uses 99% of the time and Linux looking at the desktop too.

About your conclusion, I don't think that's based on anything, so please do say what facts you base these assessments on: that FreeBSD has no resource limiting and isolation features; that it would be a 'separate implementation', when FreeBSD always tends to upgrade rather than change tools and interfaces; and that there is not enough interest from anyone to implement it, when most major FreeBSD features are actually paid for by FreeBSD sponsors.

And I think "daemonize" might have been the first? One of the earliest at least. It turns a single process into a daemon but doesn't have an overall system command to handle them.
To make daemons easier to write, more uniform, and secure, djb created https://cr.yp.to/daemontools.html and used it for many of his projects

In the corresponding FAQ list, he says this about daemons that detach themselves:

> How can I supervise a daemon that puts itself into the background? When I run inetd, my shell script exits immediately, so supervise keeps trying to restart it.

> Answer: The best answer is to fix the daemon. Having every daemon put itself into the background is bad software design.

> Ever since I have tried to think “can I solve this problem in a way that reuses known conventions or do I really, really have to do something different?” The answer, once I can control my ego, is nearly always “I don’t need to do something different”.

This really is the way of wisdom. Not that there’s not room for improvement — there is — but it is normally more effective to attack the problem at hand than try to solve ancillary problems as well.

It's Unix parlance for a forked child process that detaches from its interactive terminal with the setsid() syscall, becomes a child of init, possibly closes or redirects the 0/1/2 file descriptors to logging facilities, possibly changes its CWD, and no longer receives terminal-originating signals like SIGHUP, SIGTTIN, SIGTTOU, etc.
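
For readers who haven't seen it, the sequence described above is roughly the following. This is a sketch of the traditional double-fork idiom, shown only to illustrate what the term refers to; the `daemonize` name is arbitrary, and calling it exits the calling process.

```python
import os


def daemonize():
    """Classic double-fork detach. Returns only in the final daemon process."""
    if os.fork() > 0:
        os._exit(0)  # original process exits; the shell gets its prompt back
    os.setsid()      # become leader of a new session with no controlling tty
    if os.fork() > 0:
        os._exit(0)  # session leader exits; the daemon is reparented to init
                     # and, not being a session leader, cannot reacquire a tty
    os.chdir("/")    # don't keep any mount point busy
    # Point stdin, stdout and stderr at /dev/null (a real daemon might
    # point them at a logging facility instead).
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        os.dup2(devnull, fd)
    if devnull > 2:
        os.close(devnull)
```

Note how every step removes something a supervisor could have used: the parent/child relationship, the terminal, and the output streams.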

Apart from that, a robust non-daemon program should use the same defensive techniques. In a short-running command-line utility, it might seem "acceptable" to leak memory, not close resources and not do adequate error-checking, but that's just sloppy programming.

Well...

"Since the missile will explode when it hits it's target or at the end of it's flight, the ultimate in garbage collection is performed without programmer intervention."

https://groups.google.com/g/comp.lang.ada/c/E9bNCvDQ12k/m/1t...

"to my surprise the cp command didn't exit. Looking at the source again, I found that cp disassembles its hash table data structures nicely after copying (the forget_all call). Since the virtual size of the cp process was now more than 17 GB and the server only had 10 GB of RAM, it did a lot of swapping."

(I believe GNU cp was later updated to not free its structures on exit because of this.)

https://lists.gnu.org/archive/html/coreutils/2014-08/msg0001...

> My experience with snare and pizauth is that Rust is a viable language for writing daemons in. Rust isn’t a perfect language (e.g. unsafe Rust currently has no meaningful semantics so any code which uses unsafe comes with fewer guarantees than C)

What exactly does the author mean when they say that unsafe Rust has "no meaningful semantics"? Is this a term of art in language analysis or is the author just saying "it's weird"?

As an example, I like to point people at https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html which for many years now has contained this line:

> The precise Rust aliasing rules are somewhat in flux, but the main points are not contentious

I've sometimes found myself in situations where the only way I've been able to deal with this is to check the compiler's output and trawl forums for hints by Rust's developers about what they think/hope the semantics are/will be.

Historically speaking, this situation isn't uncommon: working out exactly what a language's semantics should be is hard, particularly when it has many novel aspects. Most major languages go through this sort of sequence (some sooner or later than others, and some end up addressing it more thoroughly than others). Eventually I expect Rust to develop something similar to the modern C spec, but we're not there yet.

Excellent - thank you for the example and the clarification. This is exactly what I was looking for.
Rust as a language is in practice defined as whatever rustc does; there is no authoritative specification like the ISO C standard.
It's a weird claim. See https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html

> It’s important to understand that unsafe doesn’t turn off the borrow checker or disable any other of Rust’s safety checks: if you use a reference in unsafe code, it will still be checked. The unsafe keyword only gives you access to these five features that are then not checked by the compiler for memory safety.

Fully defined Rust or not, unsafe code blocks definitely do not have fewer guarantees than C.

> There are use-cases for async/await, particularly for single-threaded languages, or for people writing network servers that have to have to deal with vast numbers of queries. Rust is multi-threaded – indeed, its type system forbids most classic multi-threading errors – and very few of us write servers that deal with vast numbers of queries.

This is the portion that I was seeking his feedback on. Rust was originally designed to work with OS-native threads. Later, Nginx-like software showed that emulated threads were far more efficient and fast: a single thread jumps from connection to connection instead of waiting for the other side to respond. In C you can do this, but you need the ability for a function to resume from the middle of its body where it left off. That calls for "co-routines", which, even though possible, make the code complex.
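
To illustrate the "resume from the middle of its body" idea, here is a toy round-robin scheduler built on generators. It is only a sketch of the concept: the handler and scheduler names are invented, no real I/O happens, and it is not how Nginx or any Rust runtime is actually implemented.

```python
from collections import deque


def connection(name, replies):
    """Hypothetical connection handler.

    Each `yield` marks a point where a real server would be waiting for
    the peer; the function later resumes right after that point.
    """
    for n in range(replies):
        yield f"{name}: handled reply {n}"


def run_round_robin(handlers):
    """Drive all handlers on one thread, switching at every yield."""
    ready = deque(handlers)
    events = []
    while ready:
        handler = ready.popleft()
        try:
            events.append(next(handler))  # resume where it last left off
        except StopIteration:
            continue                      # this connection is finished
        ready.append(handler)             # give the others a turn
    return events


if __name__ == "__main__":
    for event in run_round_robin([connection("a", 2), connection("b", 1)]):
        print(event)
```

The single thread alternates between "a" and "b" without either handler having to be written as an explicit state machine, which is the convenience async/await is meant to provide.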

Rust's solution is async/await. But now there are two solutions: the more integrated multi-threading, and this newly introduced single-threaded async/await. It would be good to get feedback from people who have worked with it about their good or bad experiences.

The most common Rust async runtime, tokio, is multithreaded by default.