by rzmnzm 2377 days ago
Azure has a pretty nice tech stack in my experience, what issues are you experiencing with it?

There's a laundry list.

The resource template format is inhuman gibberish designed for machines. It does idiotic things like manually hard-coding enums in the template to support drop-downs. This means templates go "stale" very quickly and are super fiddly to update to support things like new VM sizes.
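To make the hard-coded enum complaint concrete, here is a minimal Python sketch. It assumes the template restricts a parameter with an `allowedValues` list, which is the ARM mechanism behind such drop-downs; the parameter name `vmSize`, the sizes listed, and the `check_parameter` helper are hypothetical and only there for illustration.

```python
# Hypothetical fragment of a template's "parameters" section, written as a
# Python dict. "allowedValues" is the hard-coded enum that feeds the drop-down.
template_parameters = {
    "vmSize": {
        "type": "string",
        "defaultValue": "Standard_DS1_v2",
        "allowedValues": ["Standard_DS1_v2", "Standard_DS2_v2"],
    }
}


def check_parameter(parameters, name, value):
    """Return True if value passes the allowedValues list (hypothetical helper)."""
    allowed = parameters[name].get("allowedValues")
    # No allowedValues list means any value is accepted.
    return allowed is None or value in allowed


# A size that is already in the template's list is accepted.
print(check_parameter(template_parameters, "vmSize", "Standard_DS2_v2"))  # True
# A size that is missing from the list is rejected until someone edits the
# list by hand, which is how a template goes stale.
print(check_parameter(template_parameters, "vmSize", "Standard_D2s_v3"))  # False
```

Every new VM size means another manual edit to every template that carries a list like this.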

You cannot export resource groups bigger than a certain size, so if some idiot in your org makes them too big, there's no backup even in this minimalistic sense.

There is no "backup" for anything but a handful of object types (Web Apps, VM disks, Secret Vaults, and that's about it). If you want to protect your data, you'd better have an on-premise copy and 100% automated deployment capability.

Speaking of automated deployment capability, 99.9% of the Azure docs mention manual deployment or procedural (PowerShell) deployment. Declarative is clearly an afterthought, and mentioned only in passing. Because of this, ~95% of Azure deployments I've seen in the wild were entirely hand built and very fragile. If some malicious admin ran Remove-AzResource on everything, those orgs may as well declare bankruptcy the next day.

Despite this fragility, Azure sales reps are pushing-pushing-pushing for their customers to go "all in" on cloud, be "cloud first", and to basically "lift and shift" their data centres. You gotta break a few eggs to meet your quarterly targets, am I right? Right? Guys?

Speaking of the PowerShell, it's 99% auto-generated from the JSON API binding definitions, and about 5% of it is documented even in the bare minimum sense.

IPv6 "support" is hilarious. They very generously provide a /124 static public subnet prefix. So many addresses! A whole 14 of them! Woo! It's the future now! No need for NAT! A routable address for every endpoint! Let me get right on that, soon as I figure out the fiddly scripting needed to allocate addresses from hundreds of tiny pools. Much fun.
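For reference, the arithmetic can be checked with Python's standard `ipaddress` module. A /124 leaves 4 host bits, so 16 addresses in total; the figure of 14 above presumably discounts two of them, which is my assumption and not something stated in the thread. The prefix below comes from the IPv6 documentation range and is a placeholder, and `pools_needed` is a hypothetical helper.

```python
import ipaddress

# 2001:db8::/32 is the IPv6 documentation range; this prefix is a placeholder.
prefix = ipaddress.ip_network("2001:db8::/124")

host_bits = 128 - prefix.prefixlen
print(host_bits)             # 4
print(prefix.num_addresses)  # 16


def pools_needed(endpoints, usable_per_pool=14):
    """Number of /124 pools required, rounding up (hypothetical helper)."""
    return -(-endpoints // usable_per_pool)


# Giving 1000 endpoints a routable address at 14 usable addresses per pool.
print(pools_needed(1000))  # 72
```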

If you delete a DNS Zone by accident, you can't properly recover it within 48 hours because they randomly pick one of 10 name server pools. Hence, your NS bindings at the registrar will point at the wrong servers and even if you update this, there's an inevitable propagation delay. I am aware of workarounds for this like Resource Group Pinning, but only because we jumped up and down and forced support to admit that it's a problem. This little "surprise" is still undocumented.

Speaking of DNS, until we forced Microsoft to fix it, the only way to back up Azure DNS records (az network dns zone export) would corrupt CNAMEs and wouldn't round-trip.

Azure DNS uses an idiotic Zone->RecordSet->Record hierarchical structure, which makes small incremental changes hugely fiddly in scripting. You have to download the existing RecordSet, modify it, and then send it back with ETags intact. You can't treat each record as independent rows in a table, even though they effectively are.
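A rough sketch of that download-modify-upload cycle, in Python. This is not the Azure SDK: `RecordSetStore`, `ETagMismatch`, and `add_record` are hypothetical names for an in-memory stand-in, and the addresses are from the documentation range. It only illustrates why adding a single record turns into a read-modify-write with an optimistic concurrency check.

```python
import uuid


class ETagMismatch(Exception):
    """Raised when the caller's ETag no longer matches the stored one."""


class RecordSetStore:
    """Hypothetical in-memory stand-in for the record sets of one zone."""

    def __init__(self):
        self._sets = {}  # (name, rtype) -> (etag, list of records)

    def get(self, name, rtype):
        etag, records = self._sets.get((name, rtype), (None, []))
        return etag, list(records)

    def put(self, name, rtype, records, if_match):
        current_etag, _ = self._sets.get((name, rtype), (None, []))
        if current_etag != if_match:
            raise ETagMismatch(f"{name}/{rtype} changed since it was read")
        new_etag = uuid.uuid4().hex
        self._sets[(name, rtype)] = (new_etag, list(records))
        return new_etag


def add_record(store, name, rtype, value):
    """Add one record: download the whole set, modify it, send it back."""
    etag, records = store.get(name, rtype)
    if value not in records:
        records.append(value)
    return store.put(name, rtype, records, if_match=etag)


store = RecordSetStore()
add_record(store, "www", "A", "192.0.2.10")
add_record(store, "www", "A", "192.0.2.11")
print(store.get("www", "A")[1])  # ['192.0.2.10', '192.0.2.11']

# A writer holding a stale ETag is rejected and has to re-read and retry.
stale_etag, records = store.get("www", "A")
add_record(store, "www", "A", "192.0.2.12")
try:
    store.put("www", "A", records + ["192.0.2.13"], if_match=stale_etag)
except ETagMismatch as exc:
    print("conflict:", exc)
```

With independent rows, the same change would be a single insert with no prior read.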

The Azure DNS servers don't send "Additional" records (e.g.: the matching A records for the target of a CNAME record), which means that a) it's slower for clients, and b) they can charge you more. They have zero incentive to fix this, because it literally doubles their revenue for alias records of all types.

DNS Metrics are collected every 2 hours, but the graph displays only daily or hourly intervals, so you either get no detail at all, or a sawtooth graph that is useless at best and panic-inducing at worst. Imagine making a small change, glancing at the graph, and seeing it hit zero. Then... staying at zero for an hour. A joyous time, for sure.
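The sawtooth is easy to reproduce on paper. The Python sketch below uses made-up numbers (1200 queries per sample) and assumes that an hourly bucket with no sample is drawn as zero; both are assumptions for illustration, not measurements.

```python
# Hypothetical numbers: one sample every 2 hours, 1200 queries per sample.
SAMPLE_INTERVAL_HOURS = 2
samples = {hour: 1200 for hour in range(0, 12, SAMPLE_INTERVAL_HOURS)}

# An hourly chart has a bucket for every hour; hours without a sample show 0.
hourly = [samples.get(hour, 0) for hour in range(12)]
print(hourly)
# [1200, 0, 1200, 0, 1200, 0, 1200, 0, 1200, 0, 1200, 0]

# A daily chart sums everything into a single point, hiding all detail.
daily = sum(samples.values())
print(daily)  # 7200
```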

I could go on and on.

Azure has some neat stuff, but they're moving fast and breaking things, and half their products are basically MVP garbage authored by the lowest-bidding Indian outsourced teams.

To change the vnet of a VM, you still need to deallocate it and restore it into the new vnet.

For anything they don't offer directly, they offer a pile of canned VMs authored by Bitnami, who add their own layer of management scripts to them, which (in our case) corrupted the filesystem on restarting from the great Shellshock cloud reboot.

Azure is, in part, an ad-hoc cloud offering made up of services offered by mutually incompatible business units. Adding analytics in the dashboard has a 50% chance of routing you off of Azure to a separate MS monitoring/analytics solution (OMS) requiring its own licence.

Their Linux management agents have repeatedly locked up our VMs.

Because of the way Cosmos DB pricing works, we were billed ~$5,000 over 3 months for 50 MB of data that was basically unused at the time--because you pay for reserved "request units" on a per-collection basis (as in a Mongo collection), and the floor pricing drove it that high without any utilization.

As parent said, it's a bunch of MVP garbage, to which I'd add: to tick a bunch of checkboxes for Fortune 100s to sign 8-9 figure deals with promises of volume pricing. Operationally it's a flea market.

Just to add to my previous post... this is stuff that has irritated me in just the last few minutes:

* When selecting a VM size in the new-VM wizard, it silently resets it to "DS1 v2" a bit later. You have to go back to the first tab and re-select the size you really wanted.

* This then promptly resets my "Already have a Windows Server license?" setting, so then I have to reapply that too.

* The default VM template suggested is Windows 2016, not Windows 2019.

* The selection box for the region keeps resetting to US East, even though that's the worst location for me. There is no default location setting in the portal that I can set for myself. This is an Enterprise-wide setting only.

As above, I've shared these points as well with teams who own these areas. Thanks again!
Thanks very much for the product feedback. I've routed the points of feedback to the teams who own the components involved to take a look.
I do appreciate the fact that the Azure team is here, noting feedback and routing it. I had a longstanding complaint about how Cosmos' mongo API was incomplete, and my complaints about it here were noticed and it did eventually get fixed. So, it may be slow, but the Azure team isn't unresponsive.
Thanks for providing us your specific feedback. I am part of the ARM Deployments Team. We are looking to support export for resource groups (RGs) with more than 200 resources. One of the top customer requests with regard to export was the ability to multi-select individual resources in an RG to be exported, which we added a few months ago.

We have renewed our focus on ARM Template deployments (our infra-as-code offering) and have a dedicated team working on addressing the pain points described in the thread as well as other deployment issues. We are not going to solve all the problems instantaneously, but we are consciously working on it.

Would love to get your input on other ARM Deployment related feedback and areas of improvement. If you haven't had a chance you can check out some of the deployment improvements we announced recently in this video: https://www.youtube.com/watch?v=3D-JIKShrws

Jesus wept.

* Comments in JSON -- the standard explicitly bans this, making ARM templates no longer JSON, and hence not processable by standards-compliant tools. Including, hilariously, PowerShell. ConvertFrom-Json barfs, as it should. The correct approach is to have an explicitly non-JSON format like TOML or YAML that allows comments.

* "ARM has this whole time been secretly case insensitive" -- Someone clearly hasn't been notified about the harmful consequences of the Robustness Principle. [1]

* "People say that nobody authors ARM templates from scratch" -- Well, yes. This is because the template format is barely more than serialised RPC and wasn't developed like a proper programming language.

* The tab completion suggests strings that are too long for the dropdown UI and are cut off, making this feature unusable for a lot of ARM fields, such as the resource "type" values, which have long prefixes.

* "This shows as something different in the Portal" -- You have to memorise internal project names, aliases, or other difficult-to-discover names for things instead of simply using the display names. This is not user-friendly.

* A general issue I've seen with JSON-style configuration files is that things that should be side-by-side (e.g.: a bunch of related VM names) are scattered and separated by a lot of repeated "ceremony" text. Take a look at Altova XML Spy, which has a hierarchical grid view that moves such text fields into adjacent columns and rows, much like Excel does. This makes it far easier to spot mistakes or to make bulk changes.

* The "WhatIf" support is hilarious. I updated my Az module yesterday and I don't have that command. In the demo, it did something terrifying and the presenter had to close it. The output looks hideous, not something that looks useful to me.
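On the first point in the list above, comments in JSON: the behaviour is not specific to PowerShell. As a small illustration, Python's standard `json` module also rejects a document containing a `//` comment. The template fragment below is hypothetical.

```python
import json

# Hypothetical template fragment containing a "//" comment.
template_with_comment = """
{
  // the VM size to deploy
  "vmSize": "Standard_DS1_v2"
}
"""

try:
    json.loads(template_with_comment)
    parsed_ok = True
except json.JSONDecodeError as exc:
    parsed_ok = False
    print("rejected:", exc.msg)

# The same document without the comment parses fine.
clean = json.loads('{"vmSize": "Standard_DS1_v2"}')
print(parsed_ok, clean["vmSize"])  # False Standard_DS1_v2
```

Any standards-compliant parser will behave the same way, so a commented template needs a dedicated, comment-aware reader.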

This is what it boils down to: I don't want color-coded output that is almost-but-not-quite-JSON. I want tabular output. I want to see what happens if I pipe in 1K rows from a spreadsheet into a template. I want ConvertFrom-Json to work. I want to be able to validate things before deploying them. I want to see WhatIf support in the Azure Portal, with the final look. I want more than one tool to be able to process ARM templates. I want ARM templates to default to 2019, not 2015.

Tooling isn't going to fix this.

I want a language designed for humans.

This demo video to me felt like putting lipstick on a pig. Sure, it's better looking, but... still a pig.

[1] https://tools.ietf.org/id/draft-thomson-postel-was-wrong-03....

Thanks again for the feedback; we totally agree with you on some of the points you have made. We are exploring some options with regard to the language. Would you be willing to do a call with us so we can get your thoughts and feedback? My email is satyavel[at]microsoft[dot]com if you would be willing to chat.
Indeed, Azure to me seems to be focused on achieving success on absolutely anything else besides technical quality and competency. Their strategy seems to be acquiring every certification and qualification under the sun so that they can be the only legally viable option for many large customers.

Kind of sounds like the good old Microsoft that made Windows a staple in nearly every enterprise...

Thanks much for providing specific feedback on areas where Azure could do better. This is really helpful and I've followed up with owners for the components you mentioned. None of these points fall directly into areas that I work on, but I'll make sure teams are aware.

I did want to correct one misconception, which is that we do not outsource Azure product development (we do, from time to time, hire contractors to add force on projects, but these projects are led by FTEs who are involved on a daily basis). There are certainly areas where we strive to target lean MVPs so we can get new products into customers' hands faster and get feedback earlier. I'm sure there will be some sharp edges there and we greatly appreciate you giving these services a try.