Also curious whether FB outages affect global internet traffic. Are there any relevant stats? (E.g. if Google were to go down, it would have an economic cost comparable to a natural disaster.)
There are several anecdotes told internally at Facebook to new starters about times when facebook.com was down. One involves masses of people calling the police to report that Facebook is offline (wat, why?), another involves so many people flocking to Twitter to complain that FB is down that the crowd then knocked Twitter offline ("How do you know if Facebook is down? Check whether Twitter is online" sort of thing).
I don't feel comfortable sharing the exact figures, but during my time there the cost of a few minutes of downtime on an ads-adjacent message queue cluster directly turned into millions in lost revenue, so it's safe to say that losing the capability to _serve_ ads (as opposed to charging for serving them) would have a pretty damn high economic cost to those advertisers. As Workplace gains traction, downtime there will creep into the "cataclysmic" side of the economic costs scale.
Did you count the lost revenue as revenue per minute times the number of minutes the site was unavailable?
If so, that's actually incorrect when calculating revenue loss. There will actually be an uptick in traffic after the outage, either from people postponing their intended visit or from people checking whether the site is still down for them once it comes back.
I know that at LinkedIn we briefly broke part of the advertisement campaign creation flow through downtime in our service, and the impact was negligible since ads were still being served, and new ad creation would just be retried later by advertisers.
Now if you just lost the capacity to serve ads, there might be an impact, but since a lot of advertisements are display ads (meant to raise brand awareness rather than prompt an immediate call to action), it's not clear that the loss is a simple revenue-per-minute times minutes-down calculation.
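To make the arithmetic concrete, here is a rough sketch of the two ways of counting. The function names and the `recovered_fraction` parameter are made up for illustration, and the numbers in the usage lines are hypothetical, not real figures from any company.

```python
def naive_loss(revenue_per_minute: float, minutes_down: float) -> float:
    """Straight revenue-per-minute times duration, with no adjustment."""
    return revenue_per_minute * minutes_down


def adjusted_loss(revenue_per_minute: float, minutes_down: float,
                  recovered_fraction: float) -> float:
    """Naive loss scaled down by the share of activity that comes back later.

    recovered_fraction is an assumed input (0.0 = nothing is recovered,
    1.0 = every postponed visit or retried action eventually happens).
    """
    if not 0.0 <= recovered_fraction <= 1.0:
        raise ValueError("recovered_fraction must be between 0 and 1")
    return naive_loss(revenue_per_minute, minutes_down) * (1.0 - recovered_fraction)


# Hypothetical numbers, purely for illustration.
print(naive_loss(1000.0, 5))          # nothing recovered
print(adjusted_loss(1000.0, 5, 0.5))  # half the activity shifts to after the outage
```

With `recovered_fraction` at 0 the two agree, which is the case where the naive calculation is fine. The hard part in practice is estimating that fraction, and this sketch doesn't try to.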
Revenue/minute * duration actually worked here because what went down was the moral equivalent of a Kafka cluster: ads were still being served, but we couldn't log clicks/impressions over that period of time.