Hacker News new | ask | show | jobs
by microtonal 4378 days ago
Wouldn't the FSA approach require more memory than the table, since it has to maintain all the intermediate nodes?

What do you mean by 'a table'? If you want a set without false positives/negatives or a perfect hash function, you have to represent the possible elements in some way. FSAs are compact when you store a set of sequences where there is a certain amount of redundancy in the form of subsequences that occur in multiple strings. In such cases, FSAs are far more compact that a flat table of strings.

Of course, in an actual compact implementation, you would not implement states and transitions using classes. E.g. Dictomaton stores states as an array of transitions (which are unboxed), which is far more compact (see the numbers on the Github page for some comparisons with comparable data structures).