Hacker News new | ask | show | jobs
by katox 5400 days ago
Writing tree structures is always fun. Writing it in an optimized way is a lot of work though. And what's worse one needs to reimplement quite a few things if the problem solved by the particular tree changes.

To avoid that issue Tarjan and Werneck designed self-adjusting trees so the user doesn't have to understand implementation details while still having access to an effective data structure. There is a free implementation (with relevant papers linked) so if you don't want to reinvent the wheel have a look :)

https://github.com/katox/toptree

2 comments

I remember a paper that compared fast implementations of treaps, AVL, and red-black trees, and found that they are approximately equally well suited. rbtrees are best at insertion, AVL at selection, treaps at overall performance (if I recall correctly). I don't have the reference though...
That result surprises me. Red-black trees do their work at insertion time, and AVL trees split the work between insertion and selection. This should make red-black trees faster at selection and AVL trees faster at insertion, and both should do a set of insertions and selections in the same time.

(A red-black tree is a plain-old binary search tree for everything but modifications.)

AVL trees give shorter path lengths. So the resulting structure ends up being better, depending on your exact balance of insertion and selection.
Whaaa? AVL trees don't split the work between insertion and selection. Selections are plain old read-only BST lookups.
I read a paper in the spring that purported to compare, among others, red-black trees and treaps. It found that treaps were better for ordered operations but worse for random operations.

edit: For the actual comparison between the three: treaps give no guarantees. Red-black trees sacrifice some lookup speed in favor of insertions.

I thought top trees were a data structure for implementing "dynamic trees" or "link-cut trees". How can they be used for implementing search trees?

Do the logarithmic update times in top trees take into account the time to locate the node to update?