|
|
|
|
|
by jedberg
795 days ago
|
|
> if you don't have metrics for cpu/memory/storage how do you know when to scale the app When the business metrics start to fail. You don't need constant metics on storage, you can poll it every so often. If your app is constrained by CPU or RAM, then the business metrics will reflect that, and then you can turn on collection of those metrics to identify the problem. > i feel like you have never touched servers/backend in anything more than simple projects (or at all). with full storage/memory there could be an issue that you won't be able to ssh to the server, so it speaks about your knowledge in this matter. I ran all of ops for reddit for four years and headed up SRE at Netflix, so I have some experience in large scale systems. Not that it should matter. |
|
After having annoyed how many users and lost how much revenue? Having metrics to identify brewing problems before issues start to arise (be they on arriving CPU, memory, disk, network constraints or increasing network latency which will soon but not yet show up in the business metrics) is valuable.
> I ran all of ops for reddit for four years and headed up SRE at Netflix, so I have some experience in large scale systems. Not that it should matter.
I have a hard time believing at either of those it was acceptable to have a problem ongoing for days without any idea what's happening because logs and metrics weren't enabled in the first place.