The "pro" that I am is a sysadmin, and I'm asserting that evaluating all three is a waste of time.
Splunk, given its cost and complexity, is almost never right for startups.
Non-ng syslog is, on the other hand, so simplistic that it's not worth the effort of fancy configuration. Is there some kind of compelling advantage that I've been overlooking?
I never quite understood the conceit that every environment is a precious-and-unique snowflake requiring careful evaluation of any given tool.
Turns out I am a sysadmin as well, and I'm asserting that each has various strengths. I have used syslog-ng since as far back as 2001, so I have some experience with it. Today I would recommend rsyslog. It is the default logger in Ubuntu 10.04 LTS, and Fedora is also transitioning to it.
Further, I think that RELP and on-demand disk spooling of messages are compelling features. Its performance and reliability are good enough to feed your web-server access logs through.
I wouldn't overlook rsyslog, but I'm also not saying "just use it" because syslog-ng is certainly worth evaluating as well.
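For anyone unfamiliar with the idea, here is a rough Python sketch of what on-demand disk spooling means conceptually: messages stay in memory while the queue is small and only spill to a file once it fills up (for example, while the central log server is unreachable). This is not rsyslog's implementation or its configuration syntax; the class name `SpoolingQueue`, the `max_in_memory` parameter, and the one-message-per-line spool file are all assumptions made for illustration.

```python
import os
from collections import deque


class SpoolingQueue:
    """FIFO queue that holds messages in memory and spills to disk only when full.

    Hypothetical illustration of on-demand disk spooling, not rsyslog's code.
    """

    def __init__(self, spool_path, max_in_memory=1000):
        self.spool_path = spool_path
        self.max_in_memory = max_in_memory
        self.memory = deque()
        self.spooled = 0  # number of messages currently waiting on disk

    def put(self, message):
        # Once anything is on disk, keep appending there so ordering is preserved.
        if self.spooled or len(self.memory) >= self.max_in_memory:
            one_line = " ".join(message.splitlines())  # one message per line
            with open(self.spool_path, "a", encoding="utf-8") as spool:
                spool.write(one_line + "\n")
            self.spooled += 1
        else:
            self.memory.append(message)

    def _reload(self):
        # Move up to max_in_memory spooled messages back into memory
        # and rewrite whatever is left over.
        with open(self.spool_path, encoding="utf-8") as spool:
            lines = spool.read().splitlines()
        head = lines[: self.max_in_memory]
        tail = lines[self.max_in_memory:]
        self.memory.extend(head)
        if tail:
            with open(self.spool_path, "w", encoding="utf-8") as spool:
                spool.write("\n".join(tail) + "\n")
        else:
            os.remove(self.spool_path)
        self.spooled = len(tail)

    def get(self):
        """Return the oldest message, or None if the queue is empty."""
        if not self.memory and self.spooled:
            self._reload()
        return self.memory.popleft() if self.memory else None
```

The point of the design is that the disk is only touched under pressure, which is also exactly where the counter-argument about disks filling up and extra I/O comes from: the spool file grows for as long as the consumer stays away.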
This more in-depth discussion has more value. Thank you.
"I think that RELP and on-demand disk spooling of messages are compelling features"
I think we're coming at the question from different perspectives. One of my primary goals is to avoid wasting my time. Since I've already evaluated and experimentally proven syslog-ng, switching means a large time investment.
As such, features like RELP and, arguably, misfeatures[1] like disk spooling fail to compel such an investment.
Once rsyslog has matured, something that I expect will be accelerated by its inclusion in major distros, it may be a no-brainer.
For my "money," there are far more interesting and productive problems to work on than logging, which is why I do give the "just use it" advice.
"Turns out I am a sysadmin as well"
By choice or necessity? Just curiosity on my part.
[1] I have yet to encounter an environment of non-trivial size where the risk of losing logging outweighs the risk of disk filling up and/or performance degradation from additional contentious I/O. For me, it's a killer feature of centralized logging: elimination of a particular source of failure/degradation.
"Splunk, given its cost and complexity, is almost never right for startups."
Many, many startups would disagree with you. If cost is a factor, we're starting a program for startups. Feel free to ping me for more info.
We'll also have a new developer license for our next release, which will make it easier for cash-strapped developers to use Splunk.
And finally, I would just point out that using rsyslog, syslog, or syslog-ng does not preclude you from using something like Splunk. Many of our users use Splunk with a log manager, such as syslog-ng, because they like our analytics engine and reporting tools. YMMV.
Now back to your regularly scheduled discussion :)
"Many, many startups would disagree with you. If cost is a factor, we're starting a program for startups. Feel free to ping me for more info."
Thanks for the reply. Admittedly, my statement is better qualified by "early" and "web scale."
Cost is very much a factor, as $500/mo buys a lot of ramen, and, even for a slightly larger company, is still a noticeable expense, for something where the value is unknown[1] ahead of time.
Even just the nomenclature "enterprise" implies pricing that's optimized to pay for the sales process or other hand-holding, rather than something technical. Such inference is further bolstered by the fact that a full price list isn't published.
"If cost is a factor, we're starting a program for startups. Feel free to ping me for more info."
The companies I work for, in general, have neither the free time nor the inclination to pay for such a sales process. The program I'd want to see is one where I can order online with a credit card.
Time is a portion of cost, not just cash.
[1] Perhaps even unknowable, since, if the output isn't consumed, it's all potential value.
I'm a huge Splunk user, and I agree that, with its price tag, it would be very difficult to justify the costs against the gains when you're operating a small and tight ship. However, I would be very interested in the tools that they use to analyse their data. Obviously they could use a series of grep/awk scripts to pull out the data in key-value pairs, but what do they do with it after that?
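To make the question concrete, here is a small Python sketch of the "pull out key-value pairs, then do something with them" step, where the "something" is a simple count per value. The `key=value` log format, the sample lines, and the function names `parse_pairs` and `count_by` are assumptions for illustration; nothing here reflects how any particular company or Splunk itself processes logs.

```python
import re
from collections import Counter

# Matches key=value or key="quoted value". The log format is an assumption.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_pairs(line):
    """Return the key=value pairs found in one log line as a dict."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}


def count_by(lines, key):
    """Count how often each value of `key` appears across the lines."""
    return Counter(
        pairs[key] for pairs in map(parse_pairs, lines) if key in pairs
    )


# Made-up sample lines, standing in for whatever grep/awk would have selected.
sample = [
    'ts=2010-06-01T12:00:00 status=200 path=/index.html',
    'ts=2010-06-01T12:00:01 status=500 path=/login msg="db timeout"',
    'ts=2010-06-01T12:00:02 status=200 path=/login',
]

print(count_by(sample, "status").most_common())
# prints [('200', 2), ('500', 1)]
```

Counting is about the simplest possible follow-up; the same parsed dicts could just as well be loaded into a database or graphed, which is roughly the gap that the analytics and reporting tools mentioned earlier in the thread aim to fill.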