Hacker News new | ask | show | jobs
by jemfinch 882 days ago
This really doesn't seem to be comparing to comparable data structures. For int map specializations like this, the optimized alternatives are things like Judy (which is looking quite aged these days) or roaring bitmaps, not to mention that any C++ developer using "ordinary" maps will be using absl's SwissTable (flat_hash_map) or folly's F14 (F14FastMap) or perhaps absl::btree_map if order is important. Comparisons to std::map and std::unordered_map are simply too naive to make the case for this data structure.
1 comments

Oh come on. Don't be too harsh. This is an ordered map. Most of the mentioned ones are unordered maps. They might be fast but they are unordered. The only comparables are absl::btree_map and Judy Array. Without benchmarking, my gut feeling is this will beat absl::btree_map. Trie usually beats BTree.
I've written a lot of high performance/scale C++ code using a lot of data structures over the years, and ordered iteration has been very rarely needed; unordered data structures still rule the day in performance the vast majority of the time, and their lower constant factors very frequently outperform more specialized data structures. They're absolutely worth benchmarking against if the goal is actual uptake in the actual world.

In my experience, practically every single time I've used absl::btree_map, I've ended up reverting for performance reasons to either a flat hash map or, in some relatively rare cases, a sorted vector map (despite its O(n) insert/erase) because the constant factors are _so doggone low_. The experience remains: btree_map (or SkipLists, or whatever) has (in my experience) essentially never, in over a half million lines of C++, actually remained in the code.

Also, I presume (based on the implementation details, not based on actual use) that roaring bitmaps have some reasonable iteration API that would make them relevant even to the ordered comparison.

The important thing here is that if anyone wants to contend that someone should use their new data structure because it's better or faster or more optimal for some particular use case, it's important for them to demonstrate that they've thoroughly investigated the data structure space around their proposal. Comparison against std::map and std::unordered_map simply doesn't demonstrate the kind of comprehensive knowledge that I would expect from someone who claims that I should use their optimized integer map in my own code.

That just means you never have the need for implementing a querying or search functionality. Range query and search are used everywhere and need ordered maps.

I'm not saying we should rush to adopt it right the way, but at least don't dismiss it hastily. It has some novel ideas to advance the art.

What novel ideas did you see here? To me this looks like a standard 16-way compressed trie. The node encoding is quite natural if you consider the limitation of 64M entries (which is really not a lot). Did I miss something?
Subnet mask style prefix construction is pretty novel.