|
|
|
|
|
by Cu3PO42
2694 days ago
|
|
The claims seem dubious to me as well. What exactly does faster mean here? Since a speedup of two orders of magnitude is mentioned I'm assuming it refers to matching speed. Is it faster than PCRE? Is it faster than some other engine or is the claim that it's faster than any available RegEx engine?
I had a quick look at the reference and this looks like it will accept context-free languages (since recursion is allowed). I strongly doubt that a CFG parser is magically faster than any RegEx engine. > It is hundreds of times faster than the well-known regular expressions, and the speed doesn’t degrade linearly when you add patterns. The author seems to believe that matching time for RegExes is linear in the time of the pattern? Once compiled to a DFA, the pattern only has negligible influence on the matching time. EDIT: I've been trying to figure out what kind of parser this generates. It might be a LR(k) parser? It breaks on the following (admittedly contrived) example: #S = E + End;
E = {E + "+" + E, E + "*" + E, T};
T = {[1+]Alpha, N + "()"};
N = {P, Q};
P = {"a" + P, "ab"};
Q = [1+]"a";
with an input of "aaaaaaab()+beta*c" |
|
With the right optimisations, certain longer patterns can be much faster to scan for, on account of the Boyer-Moore approach.
If I asked you to search through a book for 10 consecutive pages of the letter 'Q', you wouldn't need to check every page. The same optimisation can be applied to regex. (Not that most regex implementations bother to do it, though.)