|
|
|
|
|
by they4kman
1578 days ago
|
|
I imagine only having to store 5 bits for the differences accounts for a large swath of the savings. With the first letter bucketing the rest of the words in a sort of trie-like behaviour, I wonder if there's enough duplication of successive characters for a full trie to be worth it. After all, there would need to be extra bits used to store leaf/node identity, as well as whether there are _more_ nodes past the current leaf. I suppose instead of using markers to denote ends, the parent node could store a count, but that might be wasteful for buckets at the fourth character. I'm very curious about this. |
|