If you need 1 engineer available 24/7, you need 3 teams of 3, one each in APAC, AMER and EMEA, at minimum. You can't sustain running with fewer, otherwise people can never take vacations or call in sick... If you need 2, you need at least 12, and so on.
You need one team of three for 24/7 NOC coverage, even for a worldwide customer base. Preferably four or more but three will do for a while.
Tickets are automatically generated from alerts for anything from unusual web server logs to disks filling up to network bandwidth spikes to slow SQL queries, etc. The NOC staffer attempts to resolve problems on their own, and identifies and calls the first point of contact for specific issues as necessary, following the appropriate escalation procedure.
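A minimal sketch of that alert-to-ticket flow, assuming a simple lookup-table design. The `RUNBOOK` and `OWNERS` tables, the alert names, and the fallback contact are all hypothetical and only there for illustration; a real NOC would pull these from its own monitoring and ticketing systems.

```python
from dataclasses import dataclass, field

# Hypothetical runbook: alert kind -> documented fix the NOC staffer can apply alone.
RUNBOOK = {
    "disk_full": "rotate and compress old logs",
    "bandwidth_spike": "check CDN and rate limits",
}

# Hypothetical ownership map: alert kind -> first point of contact to call.
OWNERS = {
    "slow_sql_query": "database team",
    "unusual_web_log": "web app team",
}


@dataclass
class Ticket:
    kind: str
    host: str
    actions: list = field(default_factory=list)


def handle_alert(kind: str, host: str) -> Ticket:
    """Open a ticket for an alert, then either resolve it or escalate it."""
    ticket = Ticket(kind=kind, host=host)
    if kind in RUNBOOK:
        # A documented procedure exists, so the NOC handles it without waking anyone.
        ticket.actions.append(f"NOC resolves: {RUNBOOK[kind]}")
    else:
        # No documented fix: call whoever owns the broken thing (assumed fallback contact).
        owner = OWNERS.get(kind, "on-duty manager")
        ticket.actions.append(f"NOC escalates: call {owner}")
    return ticket


if __name__ == "__main__":
    print(handle_alert("disk_full", "web01").actions)
    print(handle_alert("slow_sql_query", "db01").actions)
```

Every time an owner documents a fix, an entry moves from `OWNERS` into `RUNBOOK`, which is roughly how the share of tickets the NOC can close on its own grows over time.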
Once people get woken up in the middle of the night because their shit keeps breaking, within a few weeks the recurring problems either go away or get documented standard fixes. In a few months a staff of three can handle 90% or more of the tickets that are automatically generated.
We were doing this over 10 years ago with about 2000 servers getting traffic from 2,000 to 150,000 hits per second on Perl web apps. Honestly, nobody does anything efficiently anymore.
The "other people" are just people in your company. They aren't on call, but since they own the thing that's broken, they get called. It's standard escalation.
Two people on shift per weekday, three on weekends, and a non-NOC staff member works a weekend shift so they can cover for people who are out sick or on vacation. It's not that hard.
In the United States you can work pretty much indefinitely as long as there aren't specific regulations against it like for truck drivers.
Part-time NOC staff get six- to twelve-hour shifts as needed, and you don't need them around the clock; you only need them outside office hours.
Weekends are the only time they're there 24 hours, and usually one engineer, developer, devops, *admin, manager, etc. rotates in for a shift, so they have to be familiar with how the system works in case someone gets sick or takes vacation.
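A rough back-of-the-envelope check of what that schedule means per person. This assumes an 8-hour office day on weekdays and a single NOC seat to fill at any time; neither number is stated in the thread, so treat both as placeholders.

```python
HOURS_PER_DAY = 24
OFFICE_HOURS_PER_WEEKDAY = 8  # assumption: not stated in the thread
WEEKDAYS = 5
WEEKEND_DAYS = 2


def noc_hours_per_week(office_hours: int = OFFICE_HOURS_PER_WEEKDAY) -> int:
    """Hours per week the NOC seat must be staffed: weekday off-hours plus full weekends."""
    weekday_hours = WEEKDAYS * (HOURS_PER_DAY - office_hours)
    weekend_hours = WEEKEND_DAYS * HOURS_PER_DAY
    return weekday_hours + weekend_hours


def hours_per_person(staff: int, office_hours: int = OFFICE_HOURS_PER_WEEKDAY) -> float:
    """Average weekly hours each NOC staffer covers if one seat is split evenly."""
    return noc_hours_per_week(office_hours) / staff


if __name__ == "__main__":
    print(noc_hours_per_week())
    for staff in (3, 4):
        print(staff, round(hours_per_person(staff), 1))
```

Under those assumptions the seat needs 128 hours of coverage a week, which works out to roughly 43 hours each for three people and 32 each for four, before anyone gets sick or takes vacation.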
I once visited a Ford factory in England where the manager told us the night shift workers were paid very well (more than London front-office staff earn in their first two years at Goldman Sachs) because their life expectancy might be shortened by as much as 20 years.
When I was growing up, there was a 24/7 coal mine nearby and everyone was on two-week rotations for their shift. Which (naturally) sucked. Eternal jet lag. I thought that letting people simply pick their preferred shift and stick to it would solve the suckiness...
They were acquired for 19 billion dollars, so they were very obviously doing things right. Your condescending snark seems misplaced.
Maybe the fact that WhatsApp was colossally successful, more so than nearly any other startup -- this was the largest ever acquisition of a venture-backed company -- should get you to update your priors on what kind of scaling practice is a "good idea" and what isn't.
If you don't mind, I'd rather give a different view of WhatsApp and on-call.
It quickly became clear that WhatsApp could have a nice exit, so it made sense for some employees to sacrifice a bit of work-life balance instead of looking for another job.
Your mileage may vary depending on "How many direct phone calls from customers does your company get when the service is unavailable for 10 minutes, and how much money is lost?" We don't have this data on WhatsApp.
"You seem pretty sharp. I always wanted to date an app developer! You seem pretty trustworthy... Let me tell you about an amazing opportunity/app idea..."
Rule of thumb: If a business has to operate 24/7 worldwide (and maybe even have offices in multiple locations), it's gonna need over 100 employees.
There is no sane way around that. You don't want to be in a business with fewer than that, trust me ;)