by tom-_-
1512 days ago
Site Reliability/Infrastructure/Production Engineers typically have three focuses:

1. Managing and scaling "software infrastructure" (Kubernetes, cloud services, databases/caches, job/message brokers, etc.). SREs tend to have expertise in these systems that generalist SWEs do not.

2. Incident response. Many production incidents are caused not by a bug in application code but by a cascading failure in some piece of the software infrastructure, often triggered by unexpected load, performance regressions, or network/hardware outages. Because SREs have a broader picture of how the various pieces of infra fit together, they're best suited to start root cause analysis and determine whether it's an infra issue, a code issue, or some combination of the two.

3. DevOps/developer productivity. Many SREs work on build/release systems, internal tools, and enforcing best practices.