by pjc50 4055 days ago
Binary logs may be fine for you, but don't force it on us!

This is really the important point here. For small systems, grep works fine. The number of people administering small systems is much greater than the number of people administering large systems. The systemd controversy has caused people to fear that change they don't want will be imposed on them and their objections insultingly dismissed: a consequence of incredibly bad social "change management" by its proponents.

They are therefore deploying pre-emptive rhetorical covering fire against the day when greppable logs will be removed from the popular Linux distributions. Plain text is the lingua franca; binary formats bind you to their tools with a particular set of design choices, bugs and disadvantages. My ad hoc log grepping workflow has a different set of bugs and disadvantages, but they're mine.

4 comments

>For small systems, grep works fine

That's really the key for me. My go-to example is searching for IP addresses across different logs. If I have just one machine, and I want to find an IP in the SSH, web and mail logs, I shouldn't have to use multiple tools to get that data.
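As an illustration of that workflow, here is a minimal sketch of a single search over several plain-text logs. The function name `find_ip` is made up for this example, and the log paths in the usage note are assumptions that vary by distribution.

```python
import re


def find_ip(ip, log_paths):
    """Yield (path, line_number, line) for every line that mentions ip."""
    # Digit boundaries keep 10.0.0.1 from matching 10.0.0.12 or 110.0.0.1.
    pattern = re.compile(r"(?<!\d)" + re.escape(ip) + r"(?!\d)")
    for path in log_paths:
        # errors="replace" tolerates stray non-UTF-8 bytes in old logs.
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if pattern.search(line):
                    yield str(path), number, line.rstrip("\n")
```

Something like `list(find_ip("10.0.0.1", ["/var/log/auth.log", "/var/log/mail.log"]))` would then cover the SSH and mail logs in one pass, assuming those are where your system writes them.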

Logstash, Splunk and other tools store data in binary, as he writes, and that's perfectly valid, the only solution in fact. But I don't want to be forced to run a centralized logging server if I have just the one or two servers.

If it's okay to claim that binary logging is the only way to go, because you have hundreds of servers, it's also okay to claim that text files are the only solution, because I just have one server.

Finally, aren't those binary logs (those that come from individual services) going to be transformed into text when I transmit them to something like Splunk, only to be transformed back to some internal binary format when received? It seems we could save a transformation in that process.

In the setup the author presents, using syslog-ng and elasticsearch, it seems the logs are serialized as json for the transmission.
Yes, which means that if, say, systemd logs were to be shipped to his ElasticSearch instance, he'd need to configure journald to log to text files first, and then what's the point of having the binary format?
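To make the text-to-JSON step concrete, here is a rough sketch of turning one plain-text log line into a JSON document for shipping. The line layout and the field names (`timestamp`, `host`, `program`, `message`) are assumptions for illustration; they are not syslog-ng's or journald's actual wire format.

```python
import json
import re

# Hypothetical layout: "<timestamp> <host> <program>: <message>".
LINE = re.compile(
    r"^(?P<timestamp>\S+) (?P<host>\S+) (?P<program>[^:\s]+): (?P<message>.*)$"
)


def line_to_json(line):
    """Serialize one text log line as a JSON object string."""
    text = line.rstrip("\n")
    match = LINE.match(text)
    if match is None:
        # Keep lines that do not fit the layout instead of dropping them.
        return json.dumps({"message": text})
    return json.dumps(match.groupdict(), sort_keys=True)
```

The point is only that the source on disk stays readable text, and structure is added at the moment of transmission.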

Yes, ElasticSearch is storing data in binary, and that's fine, but you're not going to ship the raw systemd binary log to ElasticSearch, nor any other binary logs for that matter.

In fact, in the examples he provides, both sources are plain text. Syslog-ng and Apache are plain-text logs. He then transfers them to ElasticSearch, where they're stored in binary, but that's not what anyone is complaining about. The original source should be text; what you choose to do afterwards is your business.

Oddly enough, even for large (>=1e5 physical machines) systems, grep works fine. Better yet, if the logs are important, you're shunting them off for some sort of longer-term storage for post-processing and indexing _anyway_, irrespective of the underlying disk format. Some folks continue to use plain text even then, just with some distributed systems magic wrapped around the traditional Unix tools.

(If you're shunting _all_ of your log data off at that scale, you're crazy, and you'll melt your switches if you aren't careful.)

The name of the game is to think of the problems that you're solving and how they relate to the business bottom line. No sooner, no later. Additionally, what's most troubling is that we've turned this exercise into an emotional one, not one with any sort of scientific-oriented perspective.

I can personally say with conviction that I'd like to sit down and actually collect data on, e.g., how many instructions it takes to store logs to disk in plain text versus a binary format, how many it takes to retrieve logs from disk in both situations, and how much search latency I incur when trying to retrieve said logs from disk in the same. At scale, which is where most of my attention lies these days, that's the kind of thing that matters because those effects get amplified automatically—often to operators' and capacity planners' horrors—by the number of machines you have.
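A very rough starting point for that kind of measurement might look like the sketch below. It compares wall-clock time, not instruction counts (those would need a hardware-counter profiler), and the record layout in `HEADER` is a made-up fixed header plus a variable-length message, not any real logging format. It also assumes messages contain no newlines.

```python
import os
import struct
import tempfile
import time

# Hypothetical binary record: float64 timestamp, uint32 event id, uint16 message length.
HEADER = struct.Struct("<dIH")


def write_text(path, records):
    with open(path, "w", encoding="utf-8") as handle:
        for timestamp, event_id, message in records:
            handle.write(f"{timestamp!r} {event_id} {message}\n")


def read_text(path):
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            timestamp, event_id, message = line.rstrip("\n").split(" ", 2)
            records.append((float(timestamp), int(event_id), message))
    return records


def write_binary(path, records):
    with open(path, "wb") as handle:
        for timestamp, event_id, message in records:
            payload = message.encode("utf-8")
            handle.write(HEADER.pack(timestamp, event_id, len(payload)))
            handle.write(payload)


def read_binary(path):
    with open(path, "rb") as handle:
        data = handle.read()
    records, offset = [], 0
    while offset < len(data):
        timestamp, event_id, length = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        message = data[offset:offset + length].decode("utf-8")
        records.append((timestamp, event_id, message))
        offset += length
    return records


def timed(func, *args):
    """Return (elapsed_seconds, result) for one call."""
    start = time.perf_counter()
    result = func(*args)
    return time.perf_counter() - start, result


def compare(records):
    """Time a write and a read of the same records in both formats."""
    timings = {}
    formats = (("text", write_text, read_text), ("binary", write_binary, read_binary))
    with tempfile.TemporaryDirectory() as directory:
        for name, writer, reader in formats:
            path = os.path.join(directory, name + ".log")
            timings[name + "_write"], _ = timed(writer, path, records)
            timings[name + "_read"], loaded = timed(reader, path)
            # Both formats must give back exactly what was stored.
            assert loaded == records
    return timings
```

A single run on a warm page cache says little; a real comparison would repeat the measurement, use realistic record volumes, and include the search step as well.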

If you're dealing with smaller systems, it won't matter as much, but at that point, you're probably dealing with the other side of this, which is having information on how many requests you get for historical log data and what sort of criteria were used in that search. If you're getting requests less frequently than, say, once per quarter, it likely wouldn't be worth your time to invest in what Mr. Nagy is evangelizing.

tl;dr: Continue using your ad hoc grep-fu, but be mindful of how much time it takes you to get the data you're looking for. That alone will be your decision criterion for adopting something like this.

grep definitely breaks down on large systems. I have one environment with approx 5 million nodes (on the order of 1e6), and the only way to coherently manage the log updates from them is in binary format.

But even still, I like to have the text files as journals of original entry, so I can occasionally do a `tail -f incoming.log | egrep -i "somedevice"`.

And having the original files in text format is zero impediment to getting them into handy binary database form.

I hate arguing semantics, but 1e6 is not just large but very large indeed. (:

That said, I'd be curious to know some more of the details of that system actually! If you're aggregating all of those devices together, using something binary in that context definitely makes sense. In fact, if I were in your shoes and tasked with designing some means of solving that problem, I would probably use something like protobuf or capnp to emit those messages since they're well-known and well-understood serialization mechanisms.

Now, that's the integration and aggregation side of this exercise.

On a local node-by-node basis, though, I absolutely agree; having the raw text as journals of original entry for inspection in real time with `tail -f` (or, if you're using multilog, `tail -F`…) would still be incredibly useful.

Going back to Mr. Nagy's article, the space of problems that `tail -f` solves is barely overlapped by the space of problems solved by aggregation. I think he's conflated the two spaces in his article here (and especially in the one previous) whereby he's applied a one-size-fits-all solution to both where it demonstrably does not fit all.

The remote nodes all log to central DNS servers and Trap Servers. The DNS servers have a nice update.log file that provides their IP address information, and some nice text configs. The trap data goes into a binary file (a database, actually) and requires analysis through a web interface.

As a result - the DNS updates are used by me approximately 20x more often than the trap data, when doing diagnostics, even though, in theory, the trap data is incredibly richer, and, of course, has the 15 mandatory fields that are functions of the binary logging. (Time, Date, Event ID, Trap Type, etc, etc...)

Memories of supporting subscriber CPEs and having to go through Drum to analyze the data coming out of logged SNMP traps/notifications are flashing back. Thanks for that. (:

But, yeah, assuming that the nodes in discussion here are not amd64 machines but are instead subscriber CPEs, that's a totally workable (and, frankly, agreeable) solution.

"The number of people administering small systems is much greater than the number of people administering large systems"

Do you have any evidence for this statement? Because it sounds all kinds of wrong.

Power-law distribution and economies of scale?

There are a lot of hobbyists, a vast number of people with a Linux box in the corner of the office or a few cloud instances, a smaller number of people running IT for multinationals and one or two people who have whole datacenters to themselves. The larger the system, the lower the computer/human ratio.

I would tend to agree with the OP but with a caveat: most of the people who administer systems work on small systems, while most people whose full-time job is administration work on large systems. Basically there are an awful lot of people in the world whose job description includes part-time system administration.
By definition just about everyone who's actually working in DevOps counts.
http://www.internetlivestats.com/total-number-of-websites/

If I read it correctly there are about 250 million active sites (roughly). It seems unlikely that they are all massive corporate sites.

As an aside, the idea that systemd is a good thing is hilarious to me, if only because it is so brash about making an important change to a huge chunk of the system. Yes, the bugs will eventually get ironed out, but in the meantime? Count me out! I have work to do and am not interested in being a free tester for Red Hat on my live systems.

I'm pretty sure that counts (eg) each wordpress.com subdomain as a separate website. [1] counts like that and gives a roughly comparable number.

That gives a lot of economies of scale.

[1] http://news.netcraft.com/archives/category/web-server-survey...

The link I included states that those are unique hostnames. Perhaps they are including subdomains on the same IP address, but you might note that rather than quoting the 1 billion sites, I reduced that to their estimated 25% that are actually active. Additionally, they state that there are on average 3 users per site in 2014. Maybe that doesn't mean anything, but as a rough estimate that all implies far more small sites than large ones.
...systemd?
You don't need evidence for the obvious. There are a few million personal desktop PCs with Linux on them, then there are single servers used by exactly one person. Count that against the people working as professional sysadmins on big systems.
For sure the storage format should not hinder you from using grep if you want. Even with systemd you can pipe journalctl's output and use the same old regexes as its default behaviour is to be a glorified `cat` (but being able to use the --since and --until flags instead of matching date ranges by regexes makes it much better than `cat` for me).
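A small sketch of that combination: let journalctl bound the time range with `--since` and `--until`, then apply an ordinary regex to its text output. The helper names are invented for this example, and `search_journal` only works on a host that actually runs systemd.

```python
import re
import subprocess


def journalctl_command(since, until):
    """Build a journalctl call limited to a time window."""
    return ["journalctl", "--no-pager", "--since", since, "--until", until]


def grep_lines(lines, pattern):
    """Keep the lines matching a regex, like piping through grep -E."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


def search_journal(since, until, pattern):
    # Needs journalctl on PATH; raises CalledProcessError on a non-zero exit.
    completed = subprocess.run(
        journalctl_command(since, until),
        capture_output=True,
        text=True,
        check=True,
    )
    return grep_lines(completed.stdout.splitlines(), pattern)
```

For example, `search_journal("2015-02-20 10:00:00", "2015-02-20 11:00:00", r"sshd.*Failed")` would replace a date-matching regex with two flags and leave the regex to do only the content match.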