by wetherbeei
4687 days ago
The author found the optimal solution for constructing these expressions: store the inputs in a trie. Because the generated regular expressions match only the inputs, this approach may produce a more compact test when the inputs have lots of overlap. It would be cool if the inputs weren't matched exactly and frak could instead figure out a general pattern for them (decimals, capitalized words, etc.). That could give newcomers a starting expression that matches their inputs.
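To make the trie idea concrete, here is a minimal Python sketch, not frak's actual code (frak is written in Clojure). The names `build_trie`, `trie_to_regex`, and `words_to_regex` are made up for illustration. Shared prefixes collapse into a single path, which is where the compactness comes from.

```python
import re


def build_trie(words):
    """Insert each word into a nested-dict trie; the key '' marks end of word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = True  # terminal marker
    return root


def trie_to_regex(node):
    """Recursively turn a trie node into a regex fragment."""
    terminal = "" in node
    branches = [
        re.escape(ch) + trie_to_regex(node[ch])
        for ch in sorted(k for k in node if k != "")
    ]
    if not branches:
        return ""
    if len(branches) == 1 and not terminal:
        return branches[0]
    body = "(?:" + "|".join(branches) + ")"
    # A word may end here, so everything after this point is optional.
    return body + "?" if terminal else body


def words_to_regex(words):
    """Hypothetical helper: exact-match regex for a list of words."""
    return trie_to_regex(build_trie(words))


pattern = words_to_regex(["foo", "bar", "baz", "food"])
print(pattern)  # (?:ba(?:r|z)|foo(?:d)?)
```

The result matches exactly the four inputs and nothing else, which is the limitation the comment above is pointing at.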
Based on existing methods, my solution started with the same trie and then generalised it into a more flexible DFA by merging states. I used information theory (specifically Minimum Message Length) to turn it into an optimisation problem and tried a few different algorithms; adding Ant Colony Optimisation to an existing algorithm produced the best results in my tests. (They were pretty limited, though.)
[1] http://www.cse.unsw.edu.au/~wong/Papers/jason.pdf
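For readers unfamiliar with state merging, here is a rough Python sketch of just the merge step, under my own assumptions. It is not the algorithm from [1]: how to choose which states to merge (the Minimum Message Length score and the Ant Colony Optimisation search) is left out entirely, and `build_pta`, `merge_states`, and `accepts` are hypothetical names. The trie is treated as a DFA, and merging two states forces their same-symbol successors to merge as well so the automaton stays deterministic.

```python
def build_pta(words):
    """Trie as a DFA: state 0 is the start. Returns (transitions, accepting)."""
    trans = {0: {}}
    accepting = set()
    for word in words:
        state = 0
        for ch in word:
            if ch not in trans[state]:
                new = len(trans)
                trans[new] = {}
                trans[state][ch] = new
            state = trans[state][ch]
        accepting.add(state)
    return trans, accepting


def merge_states(trans, accepting, start, a, b):
    """Merge states a and b, folding successors to keep the DFA deterministic."""
    trans = {s: dict(edges) for s, edges in trans.items()}  # work on a copy
    parent = {s: s for s in trans}

    def find(s):
        while parent[s] != s:
            s = parent[s]
        return s

    pending = [(a, b)]
    while pending:
        x, y = pending.pop()
        x, y = find(x), find(y)
        if x == y:
            continue
        parent[y] = x
        for ch, target in trans[y].items():
            if ch in trans[x]:
                # Both states have an edge on ch: their targets must merge too.
                pending.append((trans[x][ch], target))
            else:
                trans[x][ch] = target

    merged = {
        s: {ch: find(t) for ch, t in edges.items()}
        for s, edges in trans.items()
        if find(s) == s
    }
    return merged, {find(s) for s in accepting}, find(start)


def accepts(trans, accepting, start, word):
    """Run the DFA on word and report whether it ends in an accepting state."""
    state = start
    for ch in word:
        if ch not in trans[state]:
            return False
        state = trans[state][ch]
    return state in accepting


trans, acc = build_pta(["ab", "abab"])
# Merging the state reached after "ab" with the one after "abab" creates a loop.
looped = merge_states(trans, acc, 0, 2, 4)
print(accepts(*looped, "ababab"))  # True, although "ababab" was never an input
```

Every merge can only widen the accepted language, so some score is needed to decide when to stop; that is the part a criterion like MML would supply.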