by throwaway892238
1480 days ago
I think this myth exists because Google was (is?) famously obsessed with SWE. But if you actually read the SRE books and look at the actual discipline of SRE ("what's the difference between SWE and SRE?"), SRE is quite blatantly just operations management.
The website is a power plant, and the SRE runs the power plant. You don't build parts to run a power plant, you use software (as in manipulate/control/operate) to run it. You act quickly when the numbers go out of line, you write reports and control how much power is going in and out, respond to surges and dips, etc.
For whatever reason, Google decided to tell people that the person who's building the klaxon and the concrete wall and the pipes for the power plant, and the person who's operating the power plant, are one and the same. But that's clearly bunk. Building a part and running a system are completely different disciplines, and anyone who does both will only be half good at both. Humans are shit at multitasking and there are few true polymaths out there. Show me a master programmer and I'll show you an amateur woodworker.
I also don't believe software engineering principles will help you reduce operational complexity. If anything, software engineering tends to make things either inefficient or subtly complicated. Reducing operational complexity comes from the discipline of operations, which isn't engineering.
Non-tech companies have known about these distinctions for like a hundred years. Deming applied scientific rigor and analysis to come up with better practices, but he didn't have to design any widgets to do it.
|
Depending on the team, SREs can absolutely be involved with "building the system", especially the klaxon ;) Examples include designing and implementing metrics used to make decisions in business logic and/or exposed to customers/users, writing routing components like mixers and proxies, developing data pipelines, etc. At Google many SRE teams build and run entire multi-tenant systems with no pure SWEs involved at all.
Healthy SRE teams should be spending 20% of their time on operations. On my team it's actually the devs who do most of the operations work. They take the pager during business hours and we route most maintenance tickets to them.