by coreysa 4227 days ago
Thanks. We are continuing to investigate this and driving needed improvements in our process and technology to avoid similar issues in the future.

The last two times there was a big issue, the same thing happened with the status dashboard (it became inaccessible). I remember the same issue when the certs expired 1.5 years ago. I really like Microsoft and was convinced "you" would somehow isolate the dashboard and host it separately, but it turns out I was wrong. Do you happen to know the reasons for hosting the status dashboard inside of Azure? It seems so counter-intuitive to me. Or is it actually hosted externally but died due to the load when the issue started to appear?

The OP mentions that Microsoft representatives gave info via public forums. When the issue appeared I looked in different places trying to find info, but all I found was a statement saying "We are aware of issues." I looked at the Azure Twitter/blog, ScottGu's Twitter/blog, Hanselman's, and the MSDN forums. I also tried this forum and reddit. Do you know where I should have gone to receive details?

Thanks. The communications and the service health dashboard are two areas for which we are creating improvement plans based on what we learned from this event. For the dashboard, we do expect it to continue to run even through outages like this one, but we did encounter an issue with our fallback mechanism that we need to understand more deeply.

For general communications, we did most of our early communication on the event using Twitter, announcing the incident and giving updates. We need to build a more formal multi-pronged approach to communicating, including faster responses in the MSDN forums and here on HN, to make sure we are reaching as many of our customers and partners as possible. Thanks again for the feedback!