Hacker News new | ask | show | jobs
by ratzkewatzke 5193 days ago
This article is a mess, and people need to stop voting it up. For example, it contains this perplexing passage:

"Let’s say we have the string “hello world”, and let’s assume that its hash is hash(‘hello world’) = 12345. So if hash(‘he’) = 1 we can say that the pattern “he” is contained in the text “hello world”"

What the heck is he talking about here?

His hash function is O(m), where m is the length of the pattern being searched for. This makes the implementation trivially O(mn), which he notes, writing in another passage:

"The Rabin-Karp algorithm has the complexity of O(nm) where n, of course, is the length of the text, while m is the length of the pattern. [...] However it’s considered that Rabin-Karp’s complexity is O(n+m) in practice"

Next time I submit a code review, I'm going to "consider" my complexity this way.

This isn't a good guide to Rabin-Karp, which is effectively discussed on Wikipedia, in Cormen, etc. It is probably a very effective way of attracting ad dollars for the banners all over the page.

Avoid, avoid.

1 comments

Hmm? Expected running time is a valid and important concept. In Quicksort, for example, picking the pivot element randomly results in O(n^2) runtime in the worst case, but has an expected runtime of O(n log(n)). The O(n log(n)) limit can be enforced using the median-of-medians method for pivot selection, but that's a pain to program and has a much higher constant factor than generating a random number; the only benefit being theoretical computer scientists sleep well at night.