Hacker News new | ask | show | jobs
by dcolkitt 2584 days ago
> As a busy self-learner, when is it time to learn data structures in depth?

From a practical perspective... The best sign is if you're spending a lot of time trying to wrangle performance or scalability and most of the bottleneck is inside the standard data structures.

E.g. if you're spending too much time or memory doing dictionary lookups, then it may be worth looking into Bloom filters. But if dictionary lookups are a trivial component of your overall performance, then there are no real gains to be had from replacing with a fancier data structure.

There's a follow-up point I'd like to make too. Skill with data structures is more about knowing the right questions to ask, rather than memorizing an encyclopedic list of a whole bunch of exotic structures. When you understand the major tradeoffs and design considerations underlying data structure design, then it's usually easy to find a solution in the literature even if you were completely unaware of it prior.

Effectively that means something like being able to translate between business requirements and academic language. "Oh, I need something that works on directed graphs, handles cycles, logarithmic in space, linear in average case time, resistant to adversaries, respects cache locality during lookup, and behaves deterministically."

The best way to get learn this stuff is to read a bunch of famous papers in data structure. Again, the point isn't to memorize a zoo. It's to get comfortable with the already well-developed framework that previous researchers have used. It's very unlikely that your business problem is truly unique at a theoretical level, so most likely someone before you has already figured out the tradeoffs and considerations that you are facing.

1 comments

> "Effectively that means something like being able to translate between business requirements and academic language. "Oh, I need something that works on directed graphs, handles cycles, logarithmic in space, linear in average case time, resistant to adversaries, respects cache locality during lookup, and behaves deterministically."

Excellent comment! I'd love to see a catalog of data structures that documented those characteristics; do you know of a good resource like that?

Unfortunately, I don't have any great suggestions for a single book or resource. (Doesn't mean something good isn't out there, just not anything I've come across.)

For me personally, the best resource has been simply to work through famous papers in the space. Blogs like the Morning Paper, centered around breaking down papers, are also helpful.

Would also like to know if there's a resource like this out there.