Hacker News
by Jedd 878 days ago
How would that work? You'd get an in-app message in 40 minutes saying there was a problem 40 minutes ago?

Anyway, this is what status pages are for: https://admin.microsoft.com/servicestatus (currently showing a whole lotta information about problems with Microsoft Teams).

3 comments

Status pages are nerd tools. Anyone who says otherwise is living in a bubble. Pretending momentarily that this is not true, you wouldn’t go to a status page if you thought that everything was fine. Not sure what is warranting such a kneejerk defence…

The implication is that the infra required to notify of an outage is lesser/different than what’s required to…run Teams. Publish it on DNS!

If you think everything is fine, presumably that's because every metric you have available is indicating fine-ness. That's probably sufficient for most users. There'll be some edge cases, but the brief write-up in some obscure IT online journal should suffice to restore anyone's slightly damaged reputation regarding a torpid non-committal response.

I'd also assume if you're using Microsoft Teams it's because someone else has determined that you shall use it. I expect that any semi-informed, mildly IT literate Microsoft Teams user has a starting position of 'low expectations'.

Anyway, parent was suggesting an in-app message to indicate some issues, so my initial question stands - how do you expediently send an instant message to a user to advise them that your instant messaging app is having a bad day?

This generation will reinvent the watchdog..
Use message sent date for one, and check how late messages are arriving. If late, ping a status check endpoint.
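
A minimal sketch of that watchdog idea, assuming each message carries a sender-side timestamp and that client clocks are roughly in sync. `DeliveryWatchdog`, `check_status`, the 30-second threshold and the window size are all hypothetical names and values for illustration, not anything Teams is known to expose.

```python
from collections import deque
from statistics import median
from typing import Callable, Deque


class DeliveryWatchdog:
    """Tracks delivery delay and triggers a status check when messages run late."""

    def __init__(self, check_status: Callable[[], None],
                 threshold_s: float = 30.0, window: int = 5) -> None:
        # check_status is a hypothetical callback that pings a status endpoint.
        self.check_status = check_status
        self.threshold_s = threshold_s
        self.delays: Deque[float] = deque(maxlen=window)

    def on_message(self, sent_at: float, received_at: float) -> bool:
        """Record one message; return True if a status check was triggered."""
        # Clamp at zero so modest clock skew does not produce negative delays.
        self.delays.append(max(0.0, received_at - sent_at))
        window_full = len(self.delays) == self.delays.maxlen
        if window_full and median(self.delays) > self.threshold_s:
            self.check_status()
            self.delays.clear()  # avoid re-triggering on every late message
            return True
        return False
```

Using the median over a small window rather than a single message means one straggler does not trigger a check on its own.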

As a sibling noted, messages on the receiver's end are time-stamped with the receive time, not the send time.

To your specific suggestion, how do you identify if the messages arrived late because of a server-side issue, rather than a client-side, network disconnect, or some other non-infra issue?
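
One rough heuristic for that question, sketched under assumptions: probe the service's status endpoint and an unrelated reference host, then compare the results. `classify_delay`, `service_ok` and `reference_ok` are hypothetical names, and the probes are left as injected callables. This only separates "my network is down" from "the service is unreachable"; it can't identify partial or regional outages.

```python
from typing import Callable


def classify_delay(service_ok: Callable[[], bool],
                   reference_ok: Callable[[], bool]) -> str:
    """Guess where a delay comes from using two independent reachability probes."""
    if not reference_ok():
        # Even the unrelated host is unreachable: likely a local or network issue.
        return "client-or-network"
    if not service_ok():
        # The network works but the service's status probe fails.
        return "server-side"
    # Both reachable: reachability alone does not explain the delay.
    return "unknown"
```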

Anyway, while I'm sure there are myriad ways that the authors of this app could have engineered it to be more resilient, self-reporting, etc. - it's clear that they did not.

That page was green for most of yesterday lol. I stopped checking around 2pm, SIX hours after people started noticing problems.