by tetha
1287 days ago
As much as I approve of having the skills to analyze local logs, beyond a relatively small scale (10-20 systems) a decent central log aggregation like OpenSearch or ELK just brings so much value, even on 1-3 nodes. It'd be one of the first changes I make to an infrastructure because it's so powerful. And it's not just the log searching and correlation value. At work, the entire discussion of "oh but we need access to all servers because of logs" just died once all logs were accessible via one web interface. I added log aggregation and suddenly only ops needed access to servers. Designing that thing with accessibility and discoverability in mind is a whole nother topic though.
Throw Loki+Grafana on a single VM somewhere (or run it on your Kubernetes/Nomad/ECS/etc. cluster) and it will get you very far, as long as you plan ahead a bit (most notably, Loki only indexes the labels attached at ingestion, not the log text itself, so you need to have an idea of what you want from your logs or your queries will be slower).
Or use a SaaS like logz.io, AWS OpenSearch, Datadog, etc. Most support OpenTelemetry now, so switching data ingestion is quite easy (unlike dashboards and alerts).
It makes sense to have centralised logs pretty much as soon as you outgrow the "everything runs on this one box and my DR plan is a prayer" stage, IMO.
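
To make the "plan ahead" point about Loki a bit more concrete, here is a rough Python sketch of the JSON body that Loki's push endpoint (`/loki/api/v1/push`) accepts. The labels in `stream` are what gets indexed, so they are the part worth deciding up front. The label names (`host`, `service`, `env`), the sample log lines, and the helper name `build_loki_push_payload` are all made up for illustration, and in practice an agent would usually ship the logs rather than hand-written code.

```python
import json
import time


def build_loki_push_payload(labels, lines, now_ns=None):
    """Build a request body for Loki's push endpoint (/loki/api/v1/push).

    `labels` is the small, low-cardinality set of key/value pairs that Loki
    indexes; everything inside `lines` is only searched at query time.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    # Each entry is [<unix epoch in nanoseconds, as a string>, <log line>].
    # The +i offset just keeps the timestamps in this batch distinct.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


# Hypothetical labels and log lines, chosen only for this example.
payload = build_loki_push_payload(
    {"host": "web-01", "service": "nginx", "env": "prod"},
    ["GET /health 200", "GET /login 500"],
    now_ns=1_700_000_000_000_000_000,
)
print(json.dumps(payload, indent=2))
```

With labels like these, a LogQL query such as `{service="nginx", env="prod"} |= "500"` can narrow down to the matching streams via the index first and only then scan the lines for `500`. If the labels you need are missing, the same question turns into a much broader scan, which is where the slower queries come from.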