Hacker News new | ask | show | jobs
by njd 3502 days ago
Interesting. How would you use one of these implementations to persist a large trie beyond what physical memory can support?
2 comments

In a "typical" case with lots of words with relatively short length, I think simply allocating a stack will outperform this "zero-allocation" scheme, because this algorithm will overwrite every memory page.

I think this algorithm might be a win if the trie contains a huge word (say, 1GB).

mmap?
Yes. But, the trie implementation would need to somehow allocate mmap file address space. Depending on supported size of the trie, may require multiple mmap files. The addressing scheme would need to consider that. If you want the trie distributed, how would you do that? Would you distribute by some function on the key, essentially resulting in N tries?

Also, if you want to support all 256 values in a byte for each byte of the string to be entered into the trie, the whole idea of using a list structure seems inefficient. On average, to find the next byte in the trie would require 128 comparison operations.

You could write your own memory allocator, overload malloc/alloca etc. and/or `operator new[]` along with `delete`.

http://en.cppreference.com/w/cpp/memory/new/operator_new