|
|
|
|
|
by semiquaver
300 days ago
|
|
> Regolith attempts to be a drop-in replacement for RegExp and requires minimal (to no) changes to be used instead
vs > Since Regolith uses Rust bindings to implement the Rust Regex library to achieve linear time worst case, this means that backreferences and look-around aren't available in Regolith either.
Obviously it cannot be a drop-in replacement if the regex dialect differs. That it has a compatible API is not the only relevant factor. I’d recommend removing the top part from the readme.Another thought: since backreferences and lookaround are the features in JS regexes which _cause_ ReDOS, why not just wrap vanilla JS regex, rejecting patterns including them? Wouldn’t that achieve the same result in a simpler way? |
|
This lets you have full backwards compatibility in languages like Python and JS/TS that support backreferences/lookarounds, without running any risk of DOS (including by your own handrolled regexes!)
And on modern processors, a suitably implemented check for a timeout would largely be branch-predicted to be a no-op, and would in theory result in no measurable change in performance. Unfortunately, the most optimized and battle-tested implementations seem to have either taken the linear-time NFA approaches, or have technical debt making timeout checks impractical (see comment in [0] on the Python core team's resistance to this) - so we're in a situation where we don't have the best of both worlds. Efforts like [1] are promising, especially if augmented with timeout logic, but early-stage.
[0] https://stackoverflow.com/a/74992735
[1] https://github.com/fancy-regex/fancy-regex