
by iainmerrick 3447 days ago

OK, I'll bite: this is not something the vast majority of engineers ever need to know.

AVL trees and red-black trees are both balanced binary search trees with the same big-O complexity. Constant factors are different, but if and when you care about that, they're both binary trees so it should be easy to switch one implementation for another. In practice you're unlikely to care because most of the time you won't be working on performance-critical code. All speculation about performance is hearsay unless you run some benchmarks.

A Google search brings up this discussion: http://discuss.fogcreek.com/joelonsoftware/default.asp?cmd=s... There's plenty of interesting stuff there, but it rather reminds me of medieval theologians debating how many angels can dance on the head of a pin.

There's a much, much more important distinction: binary trees (of all kinds) versus sorted arrays. There are many cases where a std::vector will be a lot faster than a tree, due to cache locality, and use much less memory too.
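To make the tree-versus-sorted-array point concrete, here is a minimal sketch of using a sorted std::vector as a set, next to the std::set equivalent. The helper names (contains_sorted, insert_sorted, contains_tree) are made up for illustration, and nothing here claims which one is faster on your workload; that still needs measuring.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper: membership test on a sorted vector.
// Binary search is O(log n) over contiguous memory.
bool contains_sorted(const std::vector<int>& sorted, int key) {
    return std::binary_search(sorted.begin(), sorted.end(), key);
}

// Hypothetical helper: insert while keeping the vector sorted and free of
// duplicates. Returns false if the key was already present.
// Insertion is O(n) because later elements have to be shifted.
bool insert_sorted(std::vector<int>& sorted, int key) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key);
    if (it != sorted.end() && *it == key) return false;
    sorted.insert(it, key);
    assert(std::is_sorted(sorted.begin(), sorted.end()));  // debug-only check
    return true;
}

// The same membership test against a node-based tree. std::set is typically
// implemented as a red-black tree, with one heap allocation per element.
bool contains_tree(const std::set<int>& tree, int key) {
    return tree.find(key) != tree.end();
}
```

The trade-off is visible in the comments: the vector pays O(n) per insert but keeps its elements contiguous, so it tends to suit data that is built once (or rarely) and then searched many times.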

So, discussing AVL trees and red-black trees in an interview is a waste of time. All it tells you is whether somebody once studied them and remembers their studies, or possibly just memorized them the night before. Having studied those algorithms would be a positive signal, except you can't tell study from cramming, and either way it doesn't tell you whether their CS course covered useful, up-to-date topics like cache-friendly algorithms and data structures.

1 comments

tetraodonpuffer 3447 days ago

> There are many cases where a std::vector will be a lot faster than a tree, due to cache locality, and use much less memory too.

This is also what annoys me: so much ink is devoted to big-O complexity, when on modern processors "less optimal" algorithms are often a lot faster in practice because of how the hardware works. You often end up writing the same thing three different ways so you can test which one is actually fastest for your language/compiler/toolchain/processor.
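For the "write it several ways and measure" part, a tiny timing harness is usually enough to get started. This is only a sketch: median_time_us is a hypothetical helper name, and a real benchmark would also want warm-up runs and some care that the compiler doesn't optimize the measured work away.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: call `fn` `runs` times and return a middle sample of
// the wall-clock times, in microseconds. A median is less sensitive to
// one-off stalls (page faults, context switches) than a mean.
template <typename Fn>
double median_time_us(Fn&& fn, int runs) {
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(static_cast<std::size_t>(runs));
    for (int i = 0; i < runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        fn();
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

You'd wrap each variant in a lambda, pass each one to median_time_us with the same input data, and compare the numbers on the actual target machine and compiler flags.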

In a book-level discussion, whether a comparison in your algorithm evaluates to true or false in a predictable or unpredictable pattern makes no difference to its performance. Write it out in code, however, and your CPU's branch predictor will be a LOT happier (and the code faster) if the "true" and "false" outcomes come in streaks rather than at random...
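A rough way to see this for yourself is to run the same loop over the same values in shuffled and in sorted order. The sketch below uses made-up helper names (sum_at_least, make_random_data, time_passes_us). Whether you see a gap at all depends on compiler and flags, since an optimizer may turn the branch into a conditional move or vectorize the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Sum of the elements >= threshold. The `if` is the data-dependent branch:
// on sorted input it is false for a long streak, then true for a long streak.
std::int64_t sum_at_least(const std::vector<int>& data, int threshold) {
    std::int64_t sum = 0;
    for (int x : data) {
        if (x >= threshold) sum += x;
    }
    return sum;
}

// Hypothetical helper: n pseudo-random values in [0, 255], fixed seed so
// runs are repeatable.
std::vector<int> make_random_data(std::size_t n, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, 255);
    std::vector<int> data(n);
    for (int& x : data) x = dist(rng);
    return data;
}

// Hypothetical helper: wall-clock microseconds for `reps` passes over `data`.
// Results are accumulated into `sink` so the calls are not trivially dead code.
double time_passes_us(const std::vector<int>& data, int threshold, int reps,
                      std::int64_t& sink) {
    assert(reps > 0);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) sink += sum_at_least(data, threshold);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(stop - start).count();
}
```

To try it, generate a large vector with make_random_data, copy it, std::sort the copy, and time both with a threshold of 128. The sums are identical either way; only the order in which the branch outcomes arrive changes.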