Hacker News new | ask | show | jobs
by revertts 2245 days ago
There are cases where you need them, but it's relatively rare, they definitely shouldn't be your first choice.

I'm fuzzy on all the current Linux use cases, but do remember one of the main users of rb trees was CFS. It's a neat scheduling algorithm.

Java 8 specifically: HashMap is implemented with linked list chaining. This is already not very performant, but you can only do so much with Java being a reference heavy language. RB trees are used if the bucket chain grows excessively long - so it's addressing an edge case, speeding up some worst case scenarios.

Dense hash tables based on open addressing outperform bucketed chaining. Look also at abseil's swiss table or folly's f14 if you want to see how they've further advanced to take advantage of the hardware.

In general: flat, dense, linear structures are king for performance.

2 comments

>you can only do so much with Java being a reference heavy language.

This is definitely not true, implementing array based hashtable is trivial and outperforms in most cases java.utl.HashMap. It's just the decision to have LinkedHashMap (in java 1.2) extending HashMap crippled the latter. The issue has been discussed quite a few times in java core mailing list.

I agree, HashMap is unnecessarily hamstrung because the API requirements push it into a bucket chained implementation (C++ also made this mistake and is one of the downsides of std::unordered_map). And you can definitely write a faster implementation with an array based hashtable.

But I still stand by "you can only do so much being a reference heavy language." Unless you stick to purely primitive types, implement the hash table off heap, or Project Valhalla bears fruit, it's hard to get the data layout you'd want for a really good implementation. So I agree it can be better, but it's going to be hard to get to best - hence my comment.

I do agree data layout is hard - pretty much direct byte buffers (off heap) and some poor man's memory manager (plus serialization/desirilization code).

Project Valhalla and structs is something people have been asking for since Java 1.4 or so. Still nowhere near. And indeed, "best" would take a custom implementation, case by case. It's doable, but usually very far from pretty.

I was not commenting on whether one should use it or not, just that it is still used and picked as recently as Java 8. We are in agreement for the rest of your post. :+1:
Ah! I'm sorry, I had misinterpreted your original comment.