by hdjrudni
149 days ago
> though it probably goes down less than something you self-manage, unless you're a full-time SRE with the experience to back it.

I wonder how true that is. This went down because of a bad update, which is probably like 99.99% of outages. The other 0.01% is cosmic rays causing hardware failures.

My server was up for 3.5 years with no outages because I just didn't touch it. I had to take it offline a couple days ago to move it, which made me a little sad. Took a snapshot and moved it to a new droplet, brought it back up as-is and it's running great again.

Anyway, emergencies are less emergy if things go down while you're upgrading and shuffling things around yourself. You expect hiccups if you're the one causing the hiccups. It's when someone else is tinkering on the other side of the country/planet and blows something up that suddenly you have an emergency.
Problem #1: keeping the OS current. Chances are you're running an outdated OS with some RCE vulnerabilities.
Problem #2: the setup is hard to scale organizationally. How do you give other people access to the server? How do you monitor what they do? How do you replicate the server setup across teams and keep it in sync? And so on.
In an org, something always changes, and you have to touch servers as a result.