by ghshephard
4055 days ago
grep definitely breaks down on large systems. I have one environment with approx. 5 million nodes (5e6), and the only way to coherently manage the log updates from them is in binary format. Even so, I like to keep the text files as journals of original entry, so I can occasionally do a `tail -f incoming.log | egrep -i "somedevice"`. And having the original files in text format is zero impediment to getting them into handy binary database form.
That said, I'd be curious to know some more of the details of that system actually! If you're aggregating all of those devices together, using something binary in that context definitely makes sense. In fact, if I were in your shoes and tasked with designing some means of solving that problem, I would probably use something like protobuf or capnp to emit those messages since they're well-known and well-understood serialization mechanisms.
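To make the "text first, binary later" point concrete, here's a rough sketch of deriving binary records from a text journal. It uses Python's standard `struct` module as a stand-in for protobuf or capnp, not either of those libraries. The line format (`TIMESTAMP DEVICE_ID MESSAGE`) and the record layout are assumptions made up for illustration, not anything from the system described above.

```python
import struct

# Hypothetical record layout (an assumption for this sketch):
# 8-byte unsigned timestamp, 4-byte unsigned device id, 2-byte message length,
# followed by the UTF-8 message bytes. ">" means big-endian with no padding.
HEADER = struct.Struct(">QIH")


def encode_line(line):
    """Turn one text line 'TIMESTAMP DEVICE_ID MESSAGE...' into a binary record."""
    ts, device_id, message = line.rstrip("\n").split(" ", 2)
    payload = message.encode("utf-8")
    return HEADER.pack(int(ts), int(device_id), len(payload)) + payload


def decode_record(record):
    """Recover (timestamp, device_id, message) from a record made by encode_line."""
    ts, device_id, length = HEADER.unpack_from(record)
    start = HEADER.size
    return ts, device_id, record[start:start + length].decode("utf-8")
```

The text journal stays untouched as the record of original entry; the binary form is just something you can regenerate from it whenever the layout changes.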
Now, that's the integration and aggregation side of this exercise.
On a local node-by-node basis, though, I absolutely agree; having the raw text as journals of original entry for inspection in real time with `tail -f` (or, if you're using multilog, `tail -F`…) would still be incredibly useful.
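For anyone who wants the same real-time inspection from inside a script, a minimal sketch of the `tail -f incoming.log | egrep -i "somedevice"` idea might look like the following. The function names are my own, the polling interval is arbitrary, and unlike `tail -F` it makes no attempt to cope with the file being rotated or replaced.

```python
import re
import time


def matching(lines, pattern):
    """Yield only lines matching pattern, ignoring case (roughly egrep -i)."""
    rx = re.compile(pattern, re.IGNORECASE)
    for line in lines:
        if rx.search(line):
            yield line


def follow(path, poll_seconds=0.5):
    """Yield lines appended to path, starting at its current end (roughly tail -f).

    Runs until interrupted. May yield a partial line if the writer has not
    finished it yet, and does not notice log rotation.
    """
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        f.seek(0, 2)  # start at the end of the file, like tail -f
        while True:
            line = f.readline()
            if line:
                yield line
            else:
                time.sleep(poll_seconds)  # nothing new yet; wait and retry
```

Usage would be something like `for line in matching(follow("incoming.log"), "somedevice"): print(line, end="")`, which only works this simply because the journal is plain text.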
Going back to Mr. Nagy's article, the space of problems that `tail -f` solves barely overlaps the space of problems solved by aggregation. I think he's conflated the two here (and especially in the previous article), applying a one-size-fits-all solution to both when it demonstrably does not fit all.