by canucker2016 582 days ago
From a dev who worked on Exchange Server, about the first Exchange reply-all email storm (https://techcommunity.microsoft.com/blog/exchange/me-too/610...):

    An Exchange email message actually has TWO recipient lists – there’s the recipient list that the user sees in the To: line on their email message. This is called the P2 recipient list. This is the recipient list that the user typed in. There’s also a SECOND recipient list, called the P1 recipient list that contains the list of ACTUAL recipients of the message. The P1 recipient list is totally hidden from the user, it's used by the MTA to route email messages to the correct destination server.

    Internally, the P1 list is kept as the original recipient list, plus all of the users on the destination servers.  As a result, the P1 list is significantly larger than the P2 list.

    For the sake of argument, let’s assume that 10% of the recipients on each message (130) are on each server. So each message had 100 recipients in the P1 header, plus the original DL. Assuming 100 bytes per recipient email address, this bloats each email message by 13K. And this assumes that there are 0 bytes in the message – just the headers involve 13K.

    So those 15,000,000 email messages collectively consumed 195,000,000,000 bytes of bandwidth. Yes, 195 gigabytes of bandwidth bouncing around between the email servers.

    ...


    So what did we do to fix it? Well, the first thing that we did was to fix the MTA. And we tried to scrub the MTA’s message queues. This helped a lot, but there were still millions of copies of this message floating around the system.

    To prevent anything like this happening in the future, we added a message recipient limit to Exchange – the server now has the ability to enforce a site-wide limit on the number of recipients in a single email message, which neatly prevents this from being a problem in the future.
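As a quick sanity check on the bandwidth figures in the quote, the arithmetic can be reproduced in a few lines. The numbers are the ones quoted above; the variable names are only illustrative. Note that the 13K figure matches 130 recipients at 100 bytes each, not the "100 recipients" mentioned in the same sentence.

```python
# Figures taken from the quoted post; variable names are illustrative only.
recipients_per_message = 130   # the "(130)" figure in the quote
bytes_per_recipient = 100      # assumed size of one recipient address, per the quote
messages = 15_000_000          # total messages in the storm, per the quote

# Header overhead added to every message by the P1 recipient list.
header_bytes_per_message = recipients_per_message * bytes_per_recipient

# Total bandwidth consumed by the headers alone (message bodies assumed empty).
total_bytes = header_bytes_per_message * messages

print(header_bytes_per_message)  # 13000
print(total_bytes)               # 195000000000
```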

From what I recall, the recipient limit didn't fix the problem completely; there were smaller versions of Bedlam at MSFT. I've heard that some branch of the US Dept. of Defense created their own Bedlam storm a few years back. So they had to layer in a few more guardrails to prevent another reply-all from getting out of control.

Here's one reference, https://www.theregister.com/2023/02/14/us_army_reply_all_sto..., though I thought they had one back in the 2010s.
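For illustration, a per-message recipient limit like the one described in the quote might look roughly like the sketch below. This is not how Exchange implements it: the function name, the exception, and the limit of 500 are all hypothetical, since the quote gives no number or API.

```python
# Hypothetical sketch: none of these names or values come from Exchange itself.
MAX_RECIPIENTS = 500  # assumed site-wide limit; the quote does not give a number


class TooManyRecipients(Exception):
    """Raised when a message exceeds the configured recipient limit."""


def check_recipient_limit(recipients, limit=MAX_RECIPIENTS):
    """Return the de-duplicated recipient list, or raise if it exceeds the limit."""
    # Normalize addresses so the same mailbox is not counted twice.
    unique = {addr.strip().lower() for addr in recipients}
    if len(unique) > limit:
        raise TooManyRecipients(
            f"{len(unique)} recipients exceeds the limit of {limit}"
        )
    return sorted(unique)
```

A server would run a check like this before accepting a message for delivery and reject the message when it raises.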

1 comment

Another storm hit MSFT around the start of the pandemic, https://forums.theregister.com/forum/all/2020/03/26/microsof...
Maybe this was the last fix for the reply-all email storm problem (at least for Exchange servers)? See https://www.theverge.com/2020/5/10/21253627/microsoft-reply-...