| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by mjb 1269 days ago

The universal scalability law has this term κp(p-1), which suggests that interactions are O(p^2) for systems of p components, and those interactions happen at some relative frequency κ. Right?

So why might we expect coherency to cost O(p^2)? It sure does if the data that we're trying to keep coherent is everywhere, but that's not true in typical sharded systems. It also does if we're trying to do something like atomic commitment across all nodes, but again it's not clear that's what databases actually do. It also raises the question of how we calculate κ, and how the work we're trying to get done translates into a particular value of κ.

As a conceptual model, I quite like the USL. But it doesn't seem as universal as it claims, and I haven't read anything that helps with parameter selection.

So instead we can take a step back and pick another parameter (call it ⍺), which is a random variable whose distribution is the number of shards that a database needs to coordinate over to meet its consistency and isolation goals. Then, for N requests, total work done is proportional to E[⍺] * N. Why might we believe E[⍺] is O(N) too? It could be if we're trying to be serializable and most transactions go 'everywhere'. On the other hand, with key-value accesses or weak consistency E[⍺] could be O(1). With index structures, it could be O(log N). Or whatever.

Anyway, I'm not sure that makes sense, but it does seem like the USL makes some untested assumptions about the ways systems coordinate, which makes it less useful.