
There are two design factors that most people overlook:

First, most people know you can use Z/C-curve encodings to dynamically content-address point data. There is a (very useful) generalization to hyper-rectangle types, which is perfect for content-addressing non-point geometries, but those types can't be meaningfully sequentialized at all in big systems. Most non-trivial spatial analytics involve non-point geometries, so being able to sequentialize points has limited utility.
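To make the point/box distinction concrete, here is a rough Python sketch. `interleave_bits` is the standard bit-interleaved Z-order (Morton) code for a 2D point. `box_cell` is a hypothetical helper of my own, not necessarily the generalization referred to above: it addresses a box by the smallest quadtree cell that encloses it, i.e. the shared prefix of its two corner codes. The function names and the 2D integer-grid setup are assumptions for illustration only.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Z-order (Morton) code: bit i of x lands at bit 2i, bit i of y at bit 2i+1."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def box_cell(xmin: int, ymin: int, xmax: int, ymax: int, bits: int = 16):
    """Hypothetical box address: (prefix, prefix_length_in_bits) of the
    smallest quadtree cell containing both corners of the box."""
    lo = interleave_bits(xmin, ymin, bits)
    hi = interleave_bits(xmax, ymax, bits)
    total = 2 * bits
    # Count the leading bits the two corner codes share...
    common = total - (lo ^ hi).bit_length()
    # ...and round down to a whole (y, x) bit pair, i.e. one quadtree level.
    common -= common % 2
    return lo >> (total - common), common


# A point gets exactly one position on the curve.
print(interleave_bits(3, 5))

# Two boxes of the same size get very different cells depending on
# whether they straddle a cell boundary.
print(box_cell(2, 2, 3, 3, bits=4))  # fits inside a small cell
print(box_cell(3, 3, 4, 4, bits=4))  # straddles a boundary, so a much larger cell
```

In this sketch a box is addressed by a cell (a prefix) rather than by a single curve position, and same-sized boxes can end up in cells of very different size. That gives you something to address by, but no single place on the curve to sort by.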

Second, even assuming you are using only points, the computational cost of sorting along the curve is prohibitively high for negligible benefit. Modern storage engines use small shards, which are adaptively re-sharded as needed, and medium-sized pages. For inserts, the content-addressing mechanic gets you to a single page; sorting along the curve on top of that would be significantly more expensive. For queries, because the shards are small and adaptive, a query that reaches a shard typically selects so much of it that you are better off treating it as an unsorted vector anyway. In short, much slower inserts and few (if any) query benefits.
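A toy Python illustration of that trade-off, under heavy assumptions: `UnsortedPage` and `SortedPage` are hypothetical in-memory stand-ins for a single page, keyed by an integer curve code, and bear no relation to any real storage engine's layout. The unsorted page appends on insert and scans on query; the sorted page pays for an ordered insert so that it can binary-search a code range.

```python
import bisect


class UnsortedPage:
    """Hypothetical page that keeps rows in arrival order."""

    def __init__(self):
        self.rows = []  # list of (code, payload)

    def insert(self, code, payload):
        self.rows.append((code, payload))  # amortized O(1)

    def query(self, lo, hi):
        # Full scan with a filter on the curve code.
        return [p for c, p in self.rows if lo <= c <= hi]


class SortedPage:
    """Hypothetical page that keeps rows ordered by curve code."""

    def __init__(self):
        self.codes = []
        self.payloads = []

    def insert(self, code, payload):
        i = bisect.bisect_right(self.codes, code)
        self.codes.insert(i, code)        # O(n): shifts every later entry
        self.payloads.insert(i, payload)

    def query(self, lo, hi):
        i = bisect.bisect_left(self.codes, lo)
        j = bisect.bisect_right(self.codes, hi)
        return self.payloads[i:j]


unsorted_page, sorted_page = UnsortedPage(), SortedPage()
for code in [39, 7, 12, 51, 3, 28]:
    unsorted_page.insert(code, f"row-{code}")
    sorted_page.insert(code, f"row-{code}")

# Same answer either way; only the work done per insert and per query differs.
print(sorted(unsorted_page.query(5, 40)))
print(sorted(sorted_page.query(5, 40)))
```

When the range covers most of the page, as in the query here, both versions end up touching most of the rows, so the ordered insert bought very little. The sketch makes no claim about real-world numbers.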

As an optimization, sorting along the curve tends to be applicable only where the architecture is already significantly suboptimal, e.g. the use of gigantic shards. If at all possible, you'd get more benefit from fixing the architecture than from trying to optimize a poor one.