|
|
|
|
|
by ryao
533 days ago
|
|
Your article led me to wonder if our b-trees would be faster if I switched the intra node operations to use Eytzinger ordered arrays: https://github.com/openzfs/zfs/commit/677c6f8457943fe5b56d7a... There are two ways to look at this Big O wise. One is that insertions and deletions would be asymptomatically faster since memmove() is a linear operation while bubble up/down are logarithmic operations. Look ups would not be any different asymptotically, but the constant factor might improve from being able to do prefetch. The other way is that the N is bounded, such that it is all O(1) and the difference is how big the constant factor is. I imagine I could implement it and benchmark it. However, my intuition is that the end result have lookups be marginally faster to the point of splitting hairs while insertions and deletions would be slower. While memmove() is a technically a linear time operation, it is a sequential operation that has a very low constant factor. The bubble up and bubble down operations needed to do insertions and deletions in a Eytzinger ordered array are technically random access, which has a higher constant factor. At some point, the Eytzinger ordered array operations should win, but that point is likely well beyond the size of a b-tree node. My reason for saying this is to say that Big O notation still matters, but understanding when the constant factor is significant is important. |
|