|
|
|
|
|
by pathikrit
4661 days ago
|
|
[DAWGs](http://en.wikipedia.org/wiki/Directed_acyclic_word_graph) are a special kind of Trie where similar child trees are compressed into single parents. I extended modified DAWGs and came up with a nifty data structure called ASSDAWG (Anagram Search Sorted DAWG). The way this works is whenever a string is inserted into the DAWG, it is bucket-sorted first and then inserted and the leaf nodes hold an additional number indicating which permutations are valid if we reach that leaf node from root. This has 2 nifty advantages: Since I sort the strings before insertion and since DAWGs naturally collapse similar sub trees, I get high level of compression (e.g. "eat", "ate", "tea" all become 1 path a-e-t with a list of numbers at the leaf node indicating which permutations of a-e-t are valid).
Searching for anagrams of a given string is super fast and trivial now as a path from root to leaf holds all the valid anagrams of that path at the leaf node using permutation-numbers. |
|