by colanderman
3183 days ago
Too late to edit now, but "100 million" should read "20 billion"… I missed a tree layer. (Assuming a branching factor of 200 and 4 GiB of index cache: roughly 500 thousand inner nodes of 8 KiB each can fit in cache, corresponding to 100 million leaves, which can contain 20 billion entries.) So hash indexes really only begin to show benefits for individual lookups when your data is terabyte-scale, and below that they can even be harmful for that use case (if you could otherwise benefit from an index-only lookup). But see ankrgyl's comment (sibling to this one) for better reasons to consider hash indexes.
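For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. The function name `btree_capacity` and its parameters are made up for illustration, and it makes the same simplifying assumption as the estimate above: every cached 8 KiB page is treated as a bottom-level inner node that points to 200 leaves, each holding 200 entries.

```python
KIB = 1024
GIB = 1024 ** 3


def btree_capacity(cache_bytes=4 * GIB, page_bytes=8 * KIB, branching=200):
    """Estimate how many entries are reachable when all inner nodes fit in cache.

    Hypothetical helper: assumes every cached page is a bottom-level inner
    node, and that leaves hold as many entries as inner nodes hold children.
    """
    inner_nodes = cache_bytes // page_bytes  # pages that fit in the cache
    leaves = inner_nodes * branching         # each inner node points to `branching` leaves
    entries = leaves * branching             # each leaf holds `branching` entries
    return inner_nodes, leaves, entries


inner_nodes, leaves, entries = btree_capacity()
print(f"inner nodes: {inner_nodes:,}")
print(f"leaves:      {leaves:,}")
print(f"entries:     {entries:,}")
```

With the defaults this gives 524,288 inner nodes, about 105 million leaves, and about 21 billion entries, which matches the rounded "500 thousand / 100 million / 20 billion" figures.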