| I sort of had to chuckle at the 20k IOPS AWS instance, given even a consumer $100-200 NVMe gives ~1million+ IOPS these days. I suspect now we have PCIe 5.0 NVMes this will go up to I always do wonder how much "arbitrary" cloud limits on things like this cause so many issues. I'm sure that async IO is very helpful anyway, but I bet on a 1million IOPS NVMe it is nowhere near as important. We're effectively optimising critical infrastructure tech for ~2010 hardware because that's when big cloud got going and there has been so few price reductions on things since then vs the underlying hardware costs. Obviously a consumer NVMe is not "enterprise" but my point is we are 3+ orders of magnitude off performance on cheap consumer hardware vs very expensive 'enterprise' AWS/big cloud costs. |
The IOPS figure usually hides the fact that it is not a single IO that is really fast, but a collection of them in flight at once.
More IOPS is generally best achieved by reducing the latency of a single operation, but the average latency is what actually contributes to the "fast query" experience, because a lot of the time the next IO branches off the last one (like an index or filter lookup).
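To put rough numbers on the "branchy" point, here is a minimal sketch. The latencies are made-up illustrative values, not measurements, and `dependent_chain_ms` is a hypothetical helper. It only shows that a chain of dependent reads costs depth times latency, no matter how many IOPS the device is rated for.

```python
def dependent_chain_ms(depth: int, latency_us: float) -> float:
    """Wall-clock time (ms) for `depth` IOs where each needs the previous result."""
    return depth * latency_us / 1000.0


# Assumed, illustrative per-IO latencies in microseconds (not measured values).
LOCAL_NVME_US = 100.0
NETWORK_VOLUME_US = 1000.0

if __name__ == "__main__":
    for name, latency in [("local NVMe", LOCAL_NVME_US),
                          ("network volume", NETWORK_VOLUME_US)]:
        # A 4-level index walk: each level must be read before the next is known.
        print(f"{name}: 4-level index walk = {dependent_chain_ms(4, latency):.1f} ms")
```

None of those four reads can overlap, so extra IOPS headroom does not shorten the walk; only lower per-IO latency does.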
As more and more disk-to-CPU connectivity goes over the network, we can deliver really large IOPS even when we have very high latencies (by spreading the data across hundreds of SSDs and routing it fast). With network storage we pay a huge latency cost for durability of the data, simply because of location diversification.
Every foot is a nanosecond, approximately.
That tradeoff is worth it, because you don't need clusters to deal with a bad CPU or two. You just stop & start to fix memory/CPU errors.
The AWS model pushes the latency problem to the customer and we see it in the IOPS measurements, but what we're really seeing is latency and queue depth (IOPS is roughly queue depth divided by latency), not the hardware capacity.
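The relationship between latency, queue depth and IOPS is just Little's law. A minimal sketch, with illustrative numbers (the 1 ms latency is an assumption, not a measured AWS figure) and hypothetical function names:

```python
def iops(queue_depth: float, latency_s: float) -> float:
    """Little's law: throughput = outstanding IOs / time each one takes."""
    return queue_depth / latency_s


def queue_depth_needed(target_iops: float, latency_s: float) -> float:
    """Outstanding IOs required to sustain `target_iops` at a given latency."""
    return target_iops * latency_s


if __name__ == "__main__":
    latency = 0.001  # assumed 1 ms per IO on a network volume (illustrative)
    print(f"QD=1 at 1 ms:  {iops(1, latency):.0f} IOPS")
    print(f"QD=20 at 1 ms: {iops(20, latency):.0f} IOPS")
    print(f"QD needed for 20k IOPS at 1 ms: {queue_depth_needed(20_000, latency):.0f}")
```

So a synchronous, one-IO-at-a-time workload sees about a thousand IOPS at 1 ms no matter what the volume is provisioned for, which is why async IO matters so much there.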