by dsXLII 1003 days ago
Volume. 1GB of data per day is a rounding error. If you have tens of thousands of servers, each generating hundreds of gigabytes of data per day, tail -f and grep don't scale especially well.

And I bet a hang glider can't fly from New York to Paris, either! The nerve!

Recall that the poster said this was for a small startup. If you're Google, by all means, use Google logging tools. If you aren't, then solve the problem you have, not the problem your résumé needs.

The guy asked:

> Twenty years later, I still can't fathom why we're spending so much money on Splunk, DataDog and the like.

And the poster above answered that question.

They scale perfectly fine, as long as you filter locally before aggregating. Lo and behold:

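  mkdir -p /tmp/ssh_output
  while IFS= read -r ssh_host; do
      # -n keeps ssh from reading the rest of ssh_hosts.txt from stdin;
      # the quoted command reaches the remote shell with 'keyword' intact
      ssh -n "$ssh_host" "grep 'keyword' /var/my/app.log" > "/tmp/ssh_output/${ssh_host}.log" &
  done < ssh_hosts.txt
  wait  # until every background ssh has finished
  cat /tmp/ssh_output/*.log
  rm -rf /tmp/ssh_output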
Tweak as needed. Truncation of results and real-time tailing are left as an exercise to the reader.
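
For the truncation part, here is one possible sketch, not a definitive answer. It wraps the same loop in a function, caps the number of matching lines per host with head, and prefixes each line with the host it came from. The names fan_out_grep and max_lines and the argument order are made up for illustration, and it assumes the keyword and log path contain no single quotes.

```shell
#!/bin/sh
# Sketch only: fan_out_grep and its arguments are hypothetical names.
# Assumes key-based SSH access to every host in the hosts file.

fan_out_grep() {
    keyword=$1            # pattern to search for (no single quotes)
    log_path=$2           # log file path on each remote host
    hosts_file=$3         # one hostname per line
    max_lines=${4:-100}   # cap on matching lines returned per host

    out_dir=$(mktemp -d) || return 1

    while IFS= read -r host; do
        [ -n "$host" ] || continue   # skip blank lines
        # Filter and truncate on the remote side so only the capped
        # result crosses the network; -n keeps ssh off our stdin.
        ssh -n "$host" "grep -- '$keyword' '$log_path' | head -n $max_lines" \
            > "$out_dir/$host.log" &
    done < "$hosts_file"
    wait   # until every background ssh has finished

    # Print results in glob (sorted) order, labelled by host.
    for f in "$out_dir"/*.log; do
        [ -e "$f" ] || continue
        host=$(basename "$f" .log)
        sed "s/^/$host: /" "$f"
    done

    rm -rf "$out_dir"
}
```

Something like fan_out_grep 'keyword' /var/my/app.log ssh_hosts.txt 50 would then return at most 50 matching lines per host. Running head on the remote side is deliberate: it keeps the local-filtering property that makes the original loop scale.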

100GB of logs per day? What kind of applications are that chatty?

Yeah, the solution here is to get rid of 98% of the logging.

Probably Java/JVM... Never seen something where all kinds of libraries log more.

Log level configuration is a cheap solution in this case.