by otabdeveloper 3450 days ago
Systemd isn't an init system, it's a service (a.k.a. daemon) management daemon. Its primary purpose is to restart and diagnose failing daemons cleanly.

Systemd won for one simple reason: it's the only tool that accomplishes this task without bugs. We've been running daemontools for almost a decade in production, and it's a nightmare of bugs. Very glad to be finally switching to systemd.

6 comments

> Its primary purpose is to restart and diagnose failing daemons cleanly.

If this is true, and speaking as a systemd user for close to five years now, it universally sucks at its primary purpose.

Specifically, whenever a service fails, I've lost count of the number of times systemd has barked out useless errors with 200 lines that boil down to "service has entered failed state". Whenever a systemd service fails, odds are better than even that I have to spend two hours debugging why by enabling internal logging in that service, running it in debug mode, etc.

Like when I tried to switch to networkd, and had the wrong password for a wifi I was connecting to, networkd never told me this in any way I could find. Had to go back to the old solution (after an hour of pulling my hair out) before I realised the password was wrong.

Networkd does not handle WLAN authentication. This is the job of wpa_supplicant, which is the de facto standard on Linux in every setup until maybe iwd from Intel takes over.

Yeah, but I'm sure you agree networkd should propagate errors from wpa_supplicant such that they reach the user, instead of piping them to /dev/null (not literally, but you get my point)?

systemd-networkd doesn't know about wpa_supplicant. You would start wpa_supplicant@wlan0.service and see wpa_supplicant errors there.

networkd only springs into action when wpa_supplicant succeeds in establishing the layer 2 connection and the interface becomes UP. I like the wpa_supplicant+networkd combo precisely because of this decoupling between network layers. One day, I'll get off my lazy ass and replace NetworkManager with wpa_supplicant+networkd on my notebook.

networkd doesn't know about wpa_supplicant, just as it doesn't know about openvpn, vpnc, ...

If you want a network manager that does know about those and might give more helpful error messages if they fail, use NetworkManager, for example.

Then perhaps networkd should be dropped, because:

repeat after me:

everything eventually fails.

How can you tell when a programmer has graduated from "completely new at this" to "has some valuable experience"? That point comes when they stop assuming success.

Check for errors and do something useful with the returned value.

Write tests yourself.

Fail gracefully.

Log status, so you know what was happening just before it failed.

Set reasonable timeouts on external processes.
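A minimal Python sketch of the checklist above, as an illustration only and not anything systemd itself does. The helper name `run_checked` and the 5-second default are assumptions made up for this example. It checks the exit status, logs before and after the call, and bounds the wait with a timeout.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("runner")


def run_checked(cmd, timeout_s=5.0):
    """Run cmd and return (ok, detail) instead of assuming success.

    Hypothetical helper: detail is the captured stdout on success,
    or a short reason string on failure.
    """
    # Log status first, so the last line before a failure says what was attempted.
    log.info("starting %s (timeout %.1fs)", cmd, timeout_s)
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except FileNotFoundError:
        log.error("command not found: %s", cmd[0])
        return False, "not found"
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        log.error("timed out after %.1fs: %s", timeout_s, cmd)
        return False, "timeout"
    if proc.returncode != 0:
        # Pass the child's own stderr along instead of swallowing it.
        log.error("exit %d, stderr: %s", proc.returncode, proc.stderr.strip())
        return False, f"exit {proc.returncode}"
    log.info("finished %s", cmd)
    return True, proc.stdout


# Example: a child that exits with a non-zero status is reported, not ignored.
print(run_checked([sys.executable, "-c", "raise SystemExit(3)"]))
```

The caller always gets a result it can act on, and the child's stderr ends up in the log rather than being dropped.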

Systemd is written from the perspective of a laptop user who will hand over the whole thing to a support tech when things go wrong. This is antithetical to the spirit of UNIX, which is not "write programs with one purpose that chain together well".

The spirit of UNIX is this: At any time, a user on the system may decide to become a developer or a sysadmin. The tools and information they need should be available.

`ifconfig`, `ip`, `dhclient` or Debian's `ifupdown` don't care about errors from wpa_supplicant either. Let's drop them too?

Or actually all of them (including networkd) work fine, but are not the right tool for every use case.

Indeed, I'm not using networkd anymore. And it was just an example, the phenomenon exists all over systemd.

So basically, you're saying that systemd et al integrate with everything on my system, except for when it's useful?

No, I say that networkd is not the right tool for every use case. For some it is nice, for others not.

Not that different from other tools like `ifupdown`, `NetworkManager`, `wicd`, `connman`, ...

And "not knowing about" a service is enough of an excuse to hide any errors of those services it's configured to run?

Umm, how did you configure networkd to run wpa_supplicant?

Hint: Networkd doesn't run wpa_supplicant. So it can't "hide" anything about it. Or only as much as `ifconfig` "hides" errors from wpa_supplicant.

systemd works somewhat ok for me, because I decided to stick to a very limited subset of the functionality it attempts to provide.

I've also used runit (which follows the daemontools model) as a service manager and I've never had an issue with that. I may just have been lucky though.

For me systemd fails because of some bugs I have repeatedly experienced:

1. systemd stops reaping zombies for some reason; the OS's PID table becomes full and it's impossible to create new processes -> need to hard reset the machine.
2. systemd takes over the halt/reboot commands, which is ok when it works I guess -- but for some reason I sometimes get "operation timed out" (likely because of some bug in systemd or dbus). At this point I have to hard reset the machine to get back to a usable OS.

Imagine if #2 happens to you on a remote machine. In my case, I had to call someone and ask them to reset the machine.

#1 is unthinkable for me, because it's the second thing init is supposed to do (the first one being bringing up the various services). I've never had this happen to me on older Linux or other Unices because let's face it, it's not that hard to do. At that point I honestly thought "how can I expect systemd to provide all the features it boasts when it can't do the easy things well?".
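For reference, the reaping step itself is small. Here is a rough sketch in Python, chosen only for illustration (real init systems are typically written in C and usually react to SIGCHLD rather than polling). `reap_children` and `spawn_and_reap` are hypothetical names, and this is not how systemd implements it.

```python
import os
import time


def reap_children():
    """Collect every child that has already exited, without blocking.

    Returns a list of (pid, exit_code) pairs; exit_code is the negated
    signal number if the child was killed by a signal.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        if os.WIFEXITED(status):
            code = os.WEXITSTATUS(status)
        else:
            code = -os.WTERMSIG(status)
        reaped.append((pid, code))
    return reaped


def spawn_and_reap(exit_codes, wait_s=5.0):
    """Demo helper: fork one child per exit code, then reap them all."""
    expected = {}
    for code in exit_codes:
        pid = os.fork()
        if pid == 0:
            os._exit(code)  # child exits at once and stays a zombie until reaped
        expected[pid] = code
    seen = {}
    deadline = time.monotonic() + wait_s
    while len(seen) < len(expected) and time.monotonic() < deadline:
        seen.update(reap_children())
        time.sleep(0.01)
    return expected, seen
```

Until `waitpid` is called for an exited child, its PID table entry stays allocated. That is why a PID 1 that stops doing this eventually exhausts the table.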

I've also had daemons that systemd lost track of, apparently because of a wrong setting in the .service file. Now that's not a systemd bug, but it was very difficult to debug because I couldn't "trace" the process starting. On the other hand, with daemontools/runit it's quite simple: manually execute the ./run script and see where it fails. With classical init, run the /etc/init.d/service with sh -x and you see exactly where it fails.

If you need to hard reset remotely, and you've still got a shell:

# echo b > /proc/sysrq-trigger

I've used them, too, but while they always seem like a hassle, I haven't encountered any bugs. What are the bugs that you've found?

> Its primary purpose is to restart and diagnose failing daemons cleanly...it's the only tool that accomplishes this task without bugs.

I've been running runit in production for many years, and it does just this, flawlessly.

Systemd is definitely an init system.

As well as a logging daemon, a dbus daemon, a session manager, a device node manager and many other things.

That session manager, logind, is a particular mess.

Here you have a daemon that ties into PAM that tries to second guess the kernel regarding what constitutes a session.

Effectively systemd is becoming something akin to Android. It may be using the Linux kernel, but it is not the GNU/Linux we have grown familiar with over the years.

I don't think the kernel has a session concept that corresponds to the session concept of PAM or systemd. (Yes, PAM also has sessions.)

It has some sort of sessions for processes, but that is just something sharing the same name, not the same concept.

systemd session handling is also particularly joyful to debug.

What bugs are you seeing with daemontools? I've been running it in production for 15 years now (since shortly after djb released it) and it's been rock solid the entire time.