Hacker News new | ask | show | jobs
by jqpabc123 2030 days ago
The problem I see is the assumption of a uniform distribution.

The problem spaces I normally deal with involve clumps of data, not really uniform except within each clump.

1 comments

is there something that's better than interpolation search and binary search, for that kind of data?
It's often possible to do "better" by tuning something toward the data set. Finding something that is always better is hard.

One thing I have done is a double binary search.

Store prefixes and suffixes seperately. A binary search of the prefixes identifies the suffix clump to search with a binary search. This involves a slight increase in storage --- each prefix needs a suffix pointer.

But maybe I will now try an interpolation search on the suffixes.