|
|
|
|
|
by lofi_lory
1888 days ago
|
|
I think in the general case, random strings, long patterns do not improve average performance. Or rather nothing will change, once the pattern length exceeded the expected value for the alphabet used. E.g. for a simplified DNA alphabet, extended BC shifts on average 4 characters. On the fly I can't do the stochastics on GS, but I assume substring match length and chance of reoccurrence run against each other in a similar manner. On the other hand long patterns increase preprocessing/access time: Extended BC and GS have O(m) for preprocessing, and BC may have non-constant access, depending on space tradeoff. |
|