|
|
|
|
|
by gorset
528 days ago
|
|
The number of unique int32 values is not that great, and a full bitset would only be 512MB. Hard to bit a dense bitset in performance. As a general purpose data structure, I would look at roaring bitmaps for this purpose, which has good trade-offs with great performance for most use-cases. If only simple lookups are needed over a static set, then it's worth looking at minimal perfect hash functions (https://sux.di.unimi.it). |
|
There's also the extreme of simply storing the answer for each possible query as a u32 and just index the array, but there the overhead is much larger.
Rank-select is also interesting, but I doubt it comes anywhere close in performance.
I happen to also be working on a minimal perfect hashing project that has way higher throughput than other methods (mostly because batching), see https://curiouscoding.nl/posts/ptrhash-paper/ and the first post linked from there.