| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by EdSchouten 836 days ago
	Are there actually practical cases where Rabin-Karp hashing is what dominates the running time of an application? The naive implementation already gives you 0.75 GB/s. Seems pretty fast.

3 comments

def-lkb 836 days ago

Compression, synchronization and backup systems often use rolling hash to implement "content-defined chunking", an effective form of deduplication.

In optimized implementations, Rabin-Karp is likely to be the bottleneck. See for instance https://github.com/facebook/zstd/pull/2483 which replaces a Rabin-Karp variant by a >2x faster Gear-Hashing.

link

mis1988 836 days ago

I found it to be a good pedagogical tool. In terms of actual use cases, maybe something like rsync'ing a large directory where only a few files have changed? It starts to get pretty contrived though.

link

dehrmann 836 days ago

Maybe implementing `foo LIKE '%bar%'` in a columnar DB backed by SSDs. Even if you're not strictly CPU-bound, there's a scale where you might care about the power use.

link