by Robin_Message
5779 days ago
Good luck building a trie keyed directly on Unicode code points: the branching factor would be too high, never mind lookup time. Instead, you'd build the trie over an encoding, probably UTF-8 (off the top of my head), which keeps the branching factor at 256. That is already rather large but doable, and you can switch to Judy arrays if the wasted space bothers you. Does anyone know whether there is a Unicode encoding that lets you map arbitrary ranges, so that I could, for example, use only the Greek alphabet at 1 byte per character or less? I suppose UTF-8 is already hard enough to decode.
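To make the UTF-8 idea concrete, here is a rough sketch of a trie keyed on encoded bytes rather than code points. The `ByteTrie` class and its method names are made up for illustration, and it uses a plain dict per node instead of a 256-slot array or a Judy array, so treat it as one possible shape rather than a tuned implementation.

```python
class ByteTrie:
    """Trie keyed on UTF-8 bytes, so no node can have more than 256 children."""

    def __init__(self):
        # Hypothetical layout: sparse dict of byte value (0-255) -> child node.
        # A dense 256-entry array would be faster but wastes space.
        self.children = {}
        self.terminal = False

    def insert(self, word: str) -> None:
        node = self
        for b in word.encode("utf-8"):  # iterating bytes yields ints 0-255
            node = node.children.setdefault(b, ByteTrie())
        node.terminal = True

    def contains(self, word: str) -> bool:
        node = self
        for b in word.encode("utf-8"):
            node = node.children.get(b)
            if node is None:
                return False
        return node.terminal


trie = ByteTrie()
for w in ["λόγος", "λόγοι", "log"]:
    trie.insert(w)
```

The cost is depth: each Greek letter takes two bytes in UTF-8, so a Greek word sits twice as deep in the trie as an ASCII word of the same length.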
http://en.wikipedia.org/wiki/Ternary_search_tree
http://www.strchr.com/ternary_dags