by rcarmo 855 days ago
As someone who works with Azure daily, I am amazed not just at the initial consultant's conclusion (that is, alas, typical of folk who do not understand database engines), but also at your struggle with NVMe storage (I have some pretty large SQLite databases in my personal projects).

You should not have needed an Ebsv5 (memory-optimised) instance. For that kind of thing, you should only have needed a D-series VM with a premium storage data disk (or, if you wanted a hypervisor-adjacent, very low latency volume, a temp volume in another SKU).

Anyway, many people fail to understand that Azure Storage works more like a SAN than a directly attached disk--when you attach a disk volume to the VM, you are actually attaching a _replica set_ of that storage that is at least three-way replicated and distributed across the datacenter to avoid data loss. You get RAID for free, if you will.

That is inherently slower than a hypervisor-adjacent (i.e., on-board) volume.
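
One rough way to see the difference is to time small synchronous writes on each volume. The sketch below is only an illustration under assumptions: the function names and the 4 KiB block size are mine, not anything Azure-specific, and a proper benchmark tool would give more trustworthy numbers.

```python
import os
import statistics
import tempfile
import time


def fsync_latencies_ms(directory: str, writes: int = 200, block_size: int = 4096) -> list[float]:
    """Time `writes` small write+fsync cycles on a scratch file in `directory`."""
    block = os.urandom(block_size)
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, block)
            os.fsync(fd)  # force the write down to the underlying volume
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies: list[float]) -> dict:
    """Median and worst-case latency in milliseconds."""
    return {"median_ms": statistics.median(latencies), "max_ms": max(latencies)}
```

Calling `summarize(fsync_latencies_ms(path))` once with a directory on the remote data disk and once with a directory on the temp volume should show the gap, since every fsync on a replicated disk has to cross the network before it is acknowledged.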


> Anyway, many people fail to understand that Azure Storage works more like a SAN than a directly attached disk--when you attach a disk volume to the VM, you are actually attaching a _replica set_ of that storage that is at least three-way replicated and distributed across the datacenter to avoid data loss. You get RAID for free, if you will.

I've said this a bit more sarcastically elsewhere in this thread, but basically, why would you expect people to understand this? Cloud is sold as abstracting away hardware details and giving performance SLAs billed by the hour (or minute, second, whatever). If you need to know significant details of their implementation, then you're getting to the point where you might as well buy your own hardware and save a bunch of money (which seems to be gaining some steam in a minor but noticeable cloud repatriation movement).

Well, in short, people need to understand that cloud is not their computer. It is resource allocation with underlying assumptions around availability, redundancy and performance at a scale well beyond what they would experience in their own datacenter.

And they absolutely must understand this to avoid mis-designing things. Failure to do so is just bad engineering, and a LOT of time is spent educating customers on these differences.

A case in point: I used to work with Hadoop clusters, where you would use data replication for both redundancy and distributed processing. Moving Hadoop to Azure and maintaining conventional design rules (i.e., tripling the number of disks) is the wrong way to do things, because it is required neither for redundancy nor for performance (both are catered for by the storage resources).

(Of course there are better solutions than Hadoop these days - Spark being one that is very nice from a cloud resource perspective - but many people have nine times the storage they need allocated in their cloud Hadoop clusters because of lack of understanding...)
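
One reading of the "nine times" figure, as a back-of-the-envelope sketch: HDFS defaults to a replication factor of 3, and each attached disk is itself stored at least three times by the platform. The function names here are made up for illustration, and the 3x storage figure is taken from the "at least three-way replicated" remark earlier in the thread.

```python
def physical_copies(app_replication: int, storage_replication: int) -> int:
    """Physical copies of each block when both layers replicate independently."""
    return app_replication * storage_replication


def raw_storage_tb(logical_tb: float, app_replication: int, storage_replication: int) -> float:
    """Raw capacity consumed underneath for a given amount of logical data."""
    return logical_tb * physical_copies(app_replication, storage_replication)


# Assumed figures: HDFS replication factor 3, storage layer keeping 3 replicas.
print(physical_copies(3, 3))       # 9
print(raw_storage_tb(10, 3, 3))    # 90 TB of raw capacity behind 10 TB of data
```

Dropping the application-level replication factor to 1 in this model leaves the 3 copies the storage layer already keeps.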

I would think that lifting and shifting a Hadoop setup into the cloud would be considered an anti-pattern anyway; typically you would be told to find a managed, cloud-native solution.

You would be surprised at what corporate thinking and procurement departments actually think is best.

The cloud is also being sold as “don’t worry about data loss”.

To actually deliver on that promise while maintaining the abstraction of just “dump your data on C:/ as you are used to”, compromises in performance have to be made. This is one of the biggest pitfalls of the cloud if you care more about performance than resiliency. Finding disks that don’t come with such guarantees is still possible; just be aware of the trade-off.

I may have the "Ebsv5" series code incorrect. I'd look it up, but I don't have access to the subscription any longer.

What I ultimately chose was definitely "NVMe attached" and definitely pricey. The "hypervisor-adjacent, very low latency volume" was not an obvious choice.

The best performing configuration did come from me--the db admin learning Azure on the fly--and not the four Azure architects nor the half dozen consultants with Azure credentials brought onto the project.

Ebsv5 and Ebdsv5 somewhat uniquely provide the highest possible storage performance right now in Azure, partly because they support NVMe controllers instead of SCSI.

However, the disks are still remote replica sets, as someone else mentioned. They’re not flash drives plugged into the host, despite appearances.

Something to try is (specifically) the Ebdsv5 series with the ‘d’ meaning it has local SSD cache and temp disks. Configure Postgres to use the temp disk for its scratch space and turn on read/write caching for the data disks.

You should see better performance, but still not as good as a laptop… that will have to wait for the v6 generation of VMs.