|
|
|
|
|
by ajross
573 days ago
|
|
Skiplist discussions that don't talk about heap effects are probably incomplete. The big limitation with skiplists for anything that isn't a "mostly-write-once" workload (something that, it should be noted, doesn't get much benefit from the atomic implementation discussed) is that the metadata record is variable sized, which does bad things to heap fragmentation and cache locality (two things you probably care deeply about if you're going so far as to junk your AVL tree for a lockless data structure) and isn't amenable to the slab-style optimizations used elsewhere. I love skiplists for their simplicity, especially compared with the balanced trees against which they compete. But the truth is they're very one-trick-pony data structures and don't bring lot to the table as a general choice. Their home is pretty limited. |
|
For one, a sorted sequence of keys can be written to file in a single pass, and it's easy to interleave with the data itself. For example, Lucene uses multi-level skip lists in compressed posting list files in order to quickly jump to a term and then jump to the lowest matching document ID for that term. Since these files are immutable, the tree is only built once and all the data can be stored in sorted order, which has the added benefit that these files can also be merged in a single pass.