by gaius
3496 days ago
If you need 1 engineer available 24/7, you need 3 teams of 3, one each in APAC, AMER and EMEA, at minimum. You can't sustain running with fewer, otherwise people can never take vacations or get sick. If you need 2, you need at least 12, and so on.
|
You need one team of three for 24/7 NOC coverage, even for a worldwide customer base. Preferably four or more but three will do for a while.
Tickets are automatically generated from alerts for anything from unusual web server logs to disks filling up to network bandwidth spikes to slow SQL queries, etc. The NOC staffer attempts to resolve problems on their own, and identifies and calls the first point of contact for specific issues as necessary, following the appropriate escalation procedure.
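A rough sketch of that alert-to-ticket flow, in Python. Everything here is hypothetical and only for illustration: the alert kinds, the RUNBOOKS and ESCALATION_CONTACTS tables, and the "duty manager" fallback are assumptions, not the actual setup described above.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical escalation table: alert kind -> first point of contact.
ESCALATION_CONTACTS = {
    "disk_full": "storage on-call",
    "slow_sql": "database on-call",
    "bandwidth_spike": "network on-call",
}

# Hypothetical documented fixes the NOC staffer can apply on their own.
RUNBOOKS = {
    "disk_full": "rotate and compress old logs",
}


@dataclass
class Ticket:
    kind: str
    host: str
    status: str = "open"
    notes: List[str] = field(default_factory=list)


def open_ticket(alert: dict) -> Ticket:
    """Turn a monitoring alert into a ticket automatically."""
    return Ticket(kind=alert["kind"], host=alert["host"])


def triage(ticket: Ticket) -> Ticket:
    """Resolve via a documented fix if one exists, otherwise escalate."""
    fix = RUNBOOKS.get(ticket.kind)
    if fix is not None:
        ticket.status = "resolved"
        ticket.notes.append(f"NOC applied runbook: {fix}")
    else:
        # Assumed fallback contact when no specific owner is listed.
        contact = ESCALATION_CONTACTS.get(ticket.kind, "duty manager")
        ticket.status = "escalated"
        ticket.notes.append(f"called {contact}")
    return ticket
```

The point of the split is that every escalation is a candidate for a new RUNBOOKS entry, so the share of tickets the NOC closes without calling anyone should grow over time.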
Once people get woken up in the middle of the night because their shit broke, within a few weeks the recurring problems either go away or get a documented standard fix. Within a few months, a staff of three can handle 90% or more of the automatically generated tickets.
We were doing this over 10 years ago with about 2,000 servers handling anywhere from 2,000 to 150,000 hits per second on Perl web apps. Honestly, nobody does anything efficiently anymore.