|
|
|
|
|
by ztorkelson
2944 days ago
|
|
I've found myself reaching for radix trees more and more for the same reason you listed: competitive and predictable performance. That said, I can see two counter-arguments: 1. Rehashing can be avoided if the table size is known in advance.
2. Rehashing is immaterial if the data structure is write-once, read-many. In those cases, we might reasonably expect a hash table to perform better. |
|
Examples from my own small library (http://25thandclement.com/~william/projects/phf.html), using 1 million random integers:
If I hash the items as C++ string objects rather than uint32 scalars then the time is doubled, presumably because of the dynamic allocation. Which goes to show the "perfect" in perfect hash table doesn't need to be the dominate cost.That said, 90% of the time I just use a simple red-black tree--a so-called intrusive implementation. It makes memory management much easier in C, and rare is the situation where a hash table (let alone a perfect hash table) can make a real difference relative to focusing on all the other inefficiencies.