|
|
|
|
|
by craigching
4006 days ago
|
|
Both times before this was posted it garnered 0 comments: https://news.ycombinator.com/item?id=4317545 https://news.ycombinator.com/item?id=7243568 When I read this I immediately recognized it as a PDF that I have in GoodReader (where I store all my technical PDFs). So I'm curious if it will start a good conversation about b-trees this time :) To get the conversation started, a couple of years ago, I decided to write a b-tree data structure to refresh my skills a bit. I wrote it in C since I've been doing a lot of Java the last 10 years or so. I only got as far as an in-memory b-tree and had started working on getting it stored to disk before I abandoned the project (think I was working on it during Christmas holidays and the holiday ended). I subsequently read that in-memory b-trees can actually outperform other tree data structures like rb-trees due to cache coherency. I never bothered to performance test my b-tree vs. an rb-tree I also wrote for fun. Anyway, that's my story about b-trees! |
|
But ya, B-Trees will (sometimes significantly) outperform BSTs for small keys because they will have fewer cache misses. Also interesting is that the idea that RB-Trees are faster than AVL trees for insert/delete may be outdated as well. I've seen benchmarks of AVL trees beating std::set in both libstdc++ and libc++ for both of those operations. Again this is probably due to less cache misses.
My B-Tree story is actually a B+Tree story. B-Trees (and B+Trees) can can actually be used for more than just ordered sets/maps, I've implemented a C++ sequence container[1] that allows O(log n) random access insertion and deletion. It is _much_ faster than a similar container[2] implemented using AVL trees, but at the cost of iterator stability.
[1] https://github.com/det/segmented_tree_seq
[2] http://avl-array.sourceforge.net