|
|
|
|
|
by FreakLegion
4046 days ago
|
|
Indeed there are a lot of good optimization options for tries. At a little bit of a speed penalty you can use a bitmap for navigation instead of pointers, which obliterates the memory footprint. In some use cases the result is actually smaller than the equivalent Bloom filter with a sane FP rate (though generally not smaller than a Golomb-compressed set or cuckoo filter). A compromise between the speed and memory bloat of pointers is to use Radix/Patricia tries, which are a solid go-to option. In some cases they'll be slower than naive tries, but in others they'll be faster. The fastest and most memory-efficient implementation I've seen, though, boiled each pattern in the trie down to its most unique dword or qword, and only attempted a full match after an initial hit. Locality of reference can work wonders here -- we got a 20x speedup over an already very fast naive trie this way. |
|
Oh that's interesting - like a very fast cmp before bothering to check the rest of the branches.
When you say pattern are you talking about specific letters/morphemes or subsequence? And were you pulling the [dq]words from the initial pattern or generating them some otherway (through some kind of hash or encoding function?).