| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by cakoose 1639 days ago

Even with an index, counting the number of rows is an O(N) operation in most DBs.

Why?

1. DBs typically store data in some flavor of b-tree. Determining the number of elements in a sub-range of a b-tree (or any "standard" search tree) is O(N).

2. You can improve that to O(log N) if each b-tree node stores the size of its subtree. But this also increases the cost of add/remove operations.

3. And things get more complicated for DBs that support concurrent transactions, since the count is different for each transaction, depending on when it began running. You'd need something more complicated than just a size-of-subtree integer per node.

It's all doable, but isn't trivial and increases the cost of all add/remove operations.

There have been times where I really wished it existed, though... :-)