by l-p
1722 days ago
I've never heard a single good thing about Azure, and now that I have to work
with it I understand why. The most egregious thing (for now) was an Auto Scale failure. For an entire
business day the autoscaler failed to spin up VMs which effectively killed our
product for the day. We could not scale manually either. No logs, no errors,
absolutely no idea what happened. Support tried to convince us for an entire
month that it was our own fault for misconfiguring the autoscaler, quoting
documentation that said something entirely different.
After insisting for a month the issue was finally escalated to someone who
could read the logs and immediately see what went wrong: they had run out of
VMs and could not scale up.

So we have:

1. Unreliable services, breach of SLA.
2. Zero ability to anticipate, prevent, debug, or ensure the problem will not happen again.
3. Incompetent support lacking basic reading abilities trying to gaslight me.
4. An obvious issue that should have raised alarms long before it arose.

Thanks Azure.

This week I spent half a day trying to understand why Azure would not create an
Application Gateway using a configuration that was identical to another one we
had already (do _not_ attempt Azure without Terraform).
Turns out it was another outage, with absolutely no way to know about it.
Best part is the official fix: "Please, retry in case of failure during
the deployment." It takes 30 minutes to create said resource before it enters a
failed state. XKCD 303 applies.