| HN Mirror

> Sat solvers have been implemented on backtracking regex engines.

Again, if the (not-really-)regular expression grows as the SAT problem does, that doesn’t mean game over yet. Your favourite ahead-of-time regex compiler (the good interactive ones are usually incremental, but lex/flex or re2c certainly fit) is also exponential in the size of the regex, because determinization can have exponential-size output. After the (exponentially large) compiled form is produced, though, matching is linear in the size of the haystack no matter what it is.

Burntsushi’s sibling reply links to blog post and thence to a proof-of-concept regular+backreferences expression matcher[1] that works in (quite miserable but still) polynomial time wrt the haystack, with the degree of the polynomial depending on the number of groups that have backreferences to them.

That’s not exactly fantastic, mind you—recall that you can match arbitrary context-free languages in “merely” cubic time. But it’s not exponential.

[1] https://github.com/travisdowns/polyregex