by tom-_- 1512 days ago
Site Reliability/Infrastructure/Production Engineers typically have three focuses:

Managing and scaling "software infrastructure" (Kubernetes, cloud services, databases/caches, job/message brokers, etc.). SREs tend to have expertise in these systems that generalist SWEs do not.

Incident response. Many production incidents are not caused by a bug in application code, but by a cascading failure in some piece of the software infrastructure, often triggered by unexpected load, performance regressions or network/hardware outages. Because SREs have a broader picture of how the various pieces of infra fit together, they're best suited to start root cause analysis and determine whether it's an infra issue, a code issue or some combination of the two.

DevOps/developer productivity. Many SREs work on build/release systems and internal tools, and enforce best practices.