Hacker News
by bcoates 337 days ago
Does anyone actually use journald? The last time I tried (2 years ago?) it didn't even work with any log management software (like CloudWatch, for example).

You had to either use some (often abandoned) third party tool or defeat the purpose by just reconfiguring everything to dump a text log to a file.

8 comments

I use journald whenever I feel my blood pressure getting too low.

It's slow, truncates lines, and doesn't work well at all with less. It's almost like Poettering created it so that PulseAudio wouldn't be his worst program anymore.

One convenience of journald is that it exposes a single place to plug in log collection for observability tooling

opentelemetry-collector, promtail, and so on have native plugins for it, which makes aggregation easier to set up

Most tools have "tail this plaintext file" as well, but if it's all flowing to journald, setting up log collection ends up being that much simpler

That is what syslogd has been doing since forever. Journald actually made this harder by not supporting the established syslog protocol.

And a lot of software doesn't use syslog because it's easier to print to stderr/stdout or some random log file. Journald makes it easier to capture everything no matter what the software does, including the established syslog protocol, so I don't even see your point.

If everything on your machine uses syslog, journald is a drop-in replacement for the dozen or so possible syslogd implementations.

journald isn't drop-in because it only saves logs locally. syslog is also a protocol for sending your logs to a log server.

And the usual syslog API is 2 lines: initialize with openlog(ident, options, facility) and after that do syslog(priority, message). That is on par with stderr/stdout, and far simpler than handling your own logfiles. Unless you use log4$yourlanguage or something, in which case everything is just the same and you only configure a different destination.
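For illustration, this is roughly what those two calls look like through Python's standard `syslog` module, which wraps the same C API. The ident `yourcode`, the message text, and the helper name `log_startup` are made up for the example; treat it as a sketch rather than anyone's production code.

```python
import syslog


def log_startup(ident: str = "yourcode") -> int:
    """Send one message to the local syslog daemon and return the priority used."""
    # openlog(ident, logoption, facility): tag messages with the ident and PID,
    # and file them under the "daemon" facility.
    syslog.openlog(ident, syslog.LOG_PID, syslog.LOG_DAEMON)
    # syslog(priority, message): the second of the two lines.
    syslog.syslog(syslog.LOG_INFO, "service started")
    syslog.closelog()
    # Facility and severity are OR-ed into a single number (daemon.info).
    return syslog.LOG_DAEMON | syslog.LOG_INFO
```

The returned value is the same facility/severity pair that `logger -p daemon.info` selects on the command line.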

And if you can't change your code to not use stdout/stderr, you can easily do (in bash) yourcode > >(logger -t yourcode -p daemon.info) 2> >(logger -t yourcode -p daemon.err)

Pipe it into lnav, e.g.:

journalctl -b | lnav

or use -f instead of -b for follow instead of everything since boot. Now you have a colourised journal and the power of lnav.

I architected and built our entire log-ingestion pipeline for intrusion detection on it, at Square.

I built a small Ruby wrapper around the C API. Then I used that to slurp all the logs, periodically writing the current log ID to disk. Those logs went out onto a pubsub queue, where they were ingested into both BigQuery for long-term storage / querying, and into our alerting pipeline for real-time detection.

Thanks to journald, all the logs were structured and we were able to keep a bunch of trusted metadata like timestamp, PID, UID, the binary responsible, etc. (basically anything with an underscore prefix) separate from the log message all the way to BigQuery. No parsing, and you get free isolation of the trusted bits of metadata never intermingling with user-controlled attributes.
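A rough sketch of that separation, assuming entries arrive as one JSON object per line (the shape `journalctl -o json` emits). The helper name `split_entry` and the sample entry are invented for illustration; this is Python, not the Ruby wrapper described above.

```python
import json
from typing import Tuple


def split_entry(line: str) -> Tuple[dict, dict]:
    """Split one JSON journal entry into (trusted, user_supplied) fields."""
    entry = json.loads(line)
    # Fields starting with an underscore are set by journald itself
    # (double-underscore address fields such as __CURSOR also land here).
    trusted = {k: v for k, v in entry.items() if k.startswith("_")}
    # Everything else came from the logging process and is user-controlled.
    user = {k: v for k, v in entry.items() if not k.startswith("_")}
    return trusted, user


# Hypothetical entry, shaped like journalctl's JSON output (values are strings).
SAMPLE = (
    '{"_PID":"812","_UID":"0","_EXE":"/usr/sbin/sshd",'
    '"MESSAGE":"Accepted publickey for alice","PRIORITY":"6"}'
)
```

Keeping the two dictionaries apart all the way downstream is what stops a process from spoofing its own PID or UID in the message body.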

Compared to trying to durably tail a bunch of syslog files, or having a few SPOF syslog servers that everyone forwarded to, or implementing syslog plugins, this was basically the Promised Land for us. I think we went from idea to execution in maybe a month or two (I say “we” but really this was “me”) and rolled it out as a local daemon to the entire fleet of thousands. It has received—I think—one patch release in its six+ year lifetime, and still sits there quietly collecting everything to be shipped off-host.

The only issue we’ve ever really run into that I never figured out is that a handful of times per year (across a fleet of thousands) the journald database got corrupted and you couldn’t resume collecting from the saved message ID. But we were also on an absolutely ancient version of RHEL, and I suspect anything newer probably fixed that bug. We just caught the error and restarted from an earlier timestamp. We built the whole thing around at-least-once delivery so having duplicates enter the pipeline didn’t really matter.
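The checkpoint-and-fallback idea can be sketched in a few lines of Python. This is not the original daemon: the function names, the cursor file handling, and the "1 hour ago" fallback window are all assumptions. It only builds a `journalctl` command line using the real `--after-cursor` and `--since` options.

```python
import os
import tempfile
from typing import List, Optional


def save_cursor(path: str, cursor: str) -> None:
    """Atomically persist the cursor of the last entry shipped off-host."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(cursor)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # rename over the old file in one step


def load_cursor(path: str) -> Optional[str]:
    """Return the saved cursor, or None if there is no usable checkpoint."""
    try:
        with open(path) as f:
            return f.read().strip() or None
    except FileNotFoundError:
        return None


def journalctl_args(cursor: Optional[str], fallback_since: str = "1 hour ago") -> List[str]:
    """Resume after the cursor if we have one, else re-read from a timestamp."""
    args = ["journalctl", "--output=json", "--follow"]
    if cursor:
        args.append(f"--after-cursor={cursor}")
    else:
        # Re-reading older entries produces duplicates, which is acceptable
        # when the pipeline is built for at-least-once delivery.
        args.append(f"--since={fallback_since}")
    return args
```

If resuming from the saved cursor fails, the caller would retry with `journalctl_args(None)` and accept the duplicates.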

Damn, honestly at this point I’m wishing I’d pushed to open source it.

Ironically, actually, I did write a syslog server that also forwarded into this pipeline since we had network hardware we couldn’t install custom services onto but you could point them at syslog. I also wrote this in Ruby, using the new (at the time) Fibers (“real” concurrency) feature. The main thread fired up four background threads for listening (UDP, UDP/DTLS, TCP, TCP/TLS), and each of those would hand off clients to a dedicated per-connection worker thread for message parsing. Once parsed they went onto one more background thread for collecting and sending to PubSub. Even in Ruby it could handle gazillions of messages without breaking a sweat. Fun times.
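As a small illustration of the parse-then-hand-off step, here is a Python sketch (not the Ruby server described above) that decodes the `<PRI>` header every syslog message starts with and puts the result on a queue for a single sender to drain. The names `parse_pri`, `handle_message`, and `outbound` are hypothetical.

```python
import queue
import re
from typing import Tuple

_PRI = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)

# Parsed messages wait here for the one collector/sender worker.
outbound = queue.Queue()


def parse_pri(raw: str) -> Tuple[int, int, str]:
    """Return (facility, severity, rest_of_message) for one syslog line."""
    m = _PRI.match(raw)
    if m is None:
        raise ValueError("missing <PRI> header")
    # PRI is facility * 8 + severity, so divmod recovers both parts.
    facility, severity = divmod(int(m.group(1)), 8)
    return facility, severity, m.group(2)


def handle_message(raw: str) -> None:
    """What a per-connection worker would do with each received message."""
    outbound.put(parse_pri(raw))
```

A listener would call `handle_message` for each datagram or line it receives, while one other worker pulls from `outbound` and publishes in batches.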

Since I’m rambling, we also made cool use of zstd’s pre-trained dictionary feature. Log messages are small and very uniform so they were perfect for it. By pre-sharing a dictionary optimized for our specific data with both ends of the pubsub queue, we got something like 90%–95% compression rates. Given the many terabytes of logs we were schlepping from our datacenters to GCP, this was a pretty nice bit of savings.
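To show why a pre-shared dictionary helps so much on small, uniform messages, here is a sketch using the preset-dictionary support in Python's standard `zlib` module. This swaps zlib in for zstd so the example runs without third-party packages; zstd's trained dictionaries rest on the same principle but are built by a training step over sample data. The dictionary contents and message below are invented.

```python
import zlib

# Hypothetical shared dictionary: byte strings that recur in most messages.
# Both ends of the queue must hold exactly the same bytes.
SHARED_DICT = b'{"unit":"sshd.service","priority":6,"message":"Accepted publickey for '


def compress(msg: bytes, zdict: bytes = SHARED_DICT) -> bytes:
    """Compress one message, letting it reference the shared dictionary."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(msg) + c.flush()


def decompress(blob: bytes, zdict: bytes = SHARED_DICT) -> bytes:
    """Reverse compress(); fails if the receiver's dictionary differs."""
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()


MESSAGE = b'{"unit":"sshd.service","priority":6,"message":"Accepted publickey for alice"}'
```

Comparing `len(compress(MESSAGE))` with `len(zlib.compress(MESSAGE, 9))` shows the effect: without the dictionary a message this short has almost nothing to compress against.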

For debugging my desktop:

    journalctl --follow --tail --no-trunc -b 0

For anything else: export to a syslog server, which basically any tool that matters will support in some fashion.

journalctl: unrecognized option '--tail'

Well, everyone and no one uses journald. The usual way to log is journald -> (r)syslogd -> remote log destination(s). I've never actually seen any other way.

The preferred journald way of "fetch your logs periodically" wouldn't pass any audit and doesn't work with any kind of log processing software that people use.

I use it every day?
Cloudwatch fucking sucks.

Plenty of log shippers can slurp journald. (Fluentd, filebeat, vector)

Even ChromeOS uses it, even on devices that still use Upstart.

> Even ChromeOS uses it, even on devices that still use Upstart.

I was curious about this, because I thought journald hard required systemd as pid 1, so I did a search, which promptly turned up https://www.chromium.org/chromium-os/developer-library/refer... -

> Jounald is deprecated and is about to be removed.

Hmm couldn't exactly tell when it was removed, but it looks like it lasted maybe 3-4 years. This is the commit that added it to the upstart config.

https://chromium.googlesource.com/chromiumos/platform2/+/870...

Fascinating. An initscript for journald is a special kind of cursed that I didn't expect to read today :)

You're being downvoted but you're absolutely right. The fact that Cloudwatch doesn't support journald is a major, major fail on AWS' part. It's not like this is new or obscure software.

I'm just surprised anyone wants to use cloudwatch when they don't need to. It is expensive, and far from the best observability platform.

It's always there by default. That's the only reason I've ever seen it used.