Hacker News new | ask | show | jobs
by l-p 1722 days ago
I've never heard a single good thing about Azure, and now that I have to work with it I understand why.

The most egregious thing (for now) was an Auto Scale failure. For an entire business day the autoscaler failed to spin up VMs which effectively killed our product for the day. We could not scale manually either. No logs, no errors, absolutely no idea what happened. Support tried to convice us for an entire month that it was our own fault for misconfiguring the autoscaler by quoting the documentation that was saying another entirely different thing. After insisting for a month the issue was finally escalated to someone who could read the logs and immediately see what went wrong: they had run out of VMs and could not scale up.

So we have :

1. Unreliable services, breach of SLA.

2. Zero ability to anticipate, prevent, debug, or ensure the problem will not happen again.

3. Incompetent support lacking basic reading abilities trying to gaslight me.

4. An obvious issue that should have raised alarms long before it arose.

Thanks Azure.

This week I spend half a day trying to understand why Azure would not create an Application Gateway using a configuration that was identical to another one we had already (do _not_ attempt Azure without Terraform). Turns out it was another outage with absolutely no way to know it. Best part is the official fix: "Please, retry in case of failure during the deployment." It takes 30 minutes to create said resource before it enters a failed state. XKCD 303 applies.