

by OskarS 3019 days ago

That's an overstatement of the differences between various regex engines. They all follow the basic standards, with [] being character classes, () being submatches, * being "0 or more", + being "1 or more", etc.
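To make the shared core concrete, here is a minimal sketch using Python's `re` module. The pattern and the sample string are made up for illustration; the same pattern should behave the same way in most Perl-style engines.

```python
import re

# [] is a character class, () captures a submatch,
# * means "0 or more", + means "1 or more".
pattern = re.compile(r"([a-z]+)-([0-9]*)")

# Hypothetical input, chosen only to exercise each construct.
match = pattern.search("item-42")
word, digits = match.groups()

# The second group uses *, so it may also match zero digits.
empty_digits = pattern.search("item-").group(2)

print(word, digits, repr(empty_digits))
```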

The two main differences between engines are which characters are "literal" and which are "magic" (Vim's engine is particularly annoying here), and how the "convenience character classes" are written (for example, the shorthand for the alphanumeric character class). But these are minor issues: once you've learned how to write a regex, they are trivial to look up.
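As a rough illustration of the shorthand point, the sketch below compares `\w` in Python's `re` with explicit bracket classes. The sample text is invented. POSIX-style engines spell the alphanumeric class `[[:alnum:]]` inside a bracket expression, which Python's `re` does not support, so it appears only in a comment.

```python
import re

# Hypothetical sample text.
text = "user_42 logged in"

# \w is the shorthand; with re.ASCII it means [A-Za-z0-9_].
shorthand = re.findall(r"\w+", text, flags=re.ASCII)

# The same class written out by hand.
explicit = re.findall(r"[A-Za-z0-9_]+", text)

# Strictly alphanumeric (no underscore). POSIX engines would write
# this as [[:alnum:]]+, which Python's re does not understand.
alnum_only = re.findall(r"[A-Za-z0-9]+", text)

print(shorthand, explicit, alnum_only)
```

Note that `\w` also matches the underscore, which is exactly the kind of detail that is quick to look up per engine.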

Knowledge of regular expressions transfers from one engine to another just fine.

3 comments

thefifthsetpin 3019 days ago

I generally include either \v or \V in my vim regex, at which point I no longer have to think about which characters are magic. I suppose this means that I agree that vim's default is annoying here, but imho vim more than makes up for that by making magic configurable.


ken 3019 days ago

> They all follow the basic standards, with [] being character classes, () being submatches

You've already described a feature which has different syntax in one of the primary regex dialects I use (Emacs).


pygy_ 3018 days ago

Those are the syntactic differences, but there are also semantic ones.

Most notably, the choice operator can either be ordered like in PEGs (if the first branch matches, the other isn't evaluated) or pick the branch that produces the longest match, CFG-like.
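A small sketch of the difference, using Python's `re`, which takes the ordered approach. The `longest_alternative` helper is hypothetical: it only emulates the longest-match choice by trying each branch separately, and is not how a POSIX-style engine is actually implemented.

```python
import re

text = "ab"

# Ordered choice: the first branch that matches wins,
# so the "ab" branch is never tried.
ordered = re.match(r"a|ab", text).group()

def longest_alternative(alternatives, text):
    """Hypothetical helper: try every branch at the start of text
    and keep the longest result, emulating longest-match choice."""
    matches = (re.match(alt, text) for alt in alternatives)
    return max((m.group() for m in matches if m), key=len, default=None)

longest = longest_alternative(["a", "ab"], text)

print(ordered, longest)
```

With ordered choice, reordering the branches (`ab|a`) changes the result; with longest-match choice it does not.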
