Hacker News
by umvi 1680 days ago
Regex is powerful but I've found like 90% of the time I encounter one it would have been far simpler and more readable to use find + substring indexing or string splitting.

Imo replacing several nested if statements with a single esoteric regex is not necessarily a win. It depends on if pattern matching is really the best tool for the job.
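To make the comparison concrete, here is a rough sketch in Python of one small task (grab everything after the first "@") done three ways. The function names are made up for illustration, and it assumes single-line input, since `.` in the regex does not match a newline by default.

```python
import re
from typing import Optional


def domain_with_find(address: str) -> Optional[str]:
    """find + substring indexing: text after the first '@', or None."""
    at = address.find("@")
    if at == -1:  # find() signals "not found" with -1
        return None
    return address[at + 1:]


def domain_with_partition(address: str) -> Optional[str]:
    """String splitting: partition() returns an empty separator on a miss."""
    _, sep, domain = address.partition("@")
    return domain if sep else None


def domain_with_regex(address: str) -> Optional[str]:
    """Regex: capture everything after the first '@' on the line."""
    match = re.search(r"@(.*)", address)
    return match.group(1) if match else None
```

For a task this small all three are about the same length, which is roughly the point being made: the regex buys nothing here, and the splitting version has no index arithmetic at all.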

2 comments

> find + substring indexing

An endless source of off-by-one errors, not to mention buffer overflows, index out of bounds exceptions, accidental negative indexing.
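A small Python sketch of the "accidental negative indexing" case, with hypothetical function names. It only covers that one failure mode; Python raises or wraps around rather than overflowing a buffer, so the buffer overflow case would need a language like C.

```python
from typing import Optional


def key_before_colon_buggy(line: str) -> str:
    # Bug: find() returns -1 when ':' is missing, so this becomes
    # line[:-1] and silently drops the last character instead of failing.
    return line[:line.find(":")]


def key_before_colon(line: str) -> Optional[str]:
    # Checking for the -1 sentinel makes the "no colon" case explicit.
    idx = line.find(":")
    if idx == -1:
        return None
    return line[:idx]
```

With input `"name"` the buggy version returns `"nam"` without any error, which is the kind of thing that slips through unless there is a test for the missing-delimiter case.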

How are you both getting buffer overflows and bounds check failures in the same code?

Anyway, these are problems that arise if you don't test the code. In that scenario, regexps are an endless source of unexpected behavior as well, including, in some implementations, stack overflows and ReDoS attack surfaces.
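For anyone who hasn't run into ReDoS, a minimal sketch in Python, whose `re` module is a backtracking engine. The pattern `(a+)+$` is a textbook nested-quantifier example, not something from this thread, and `seconds_to_reject` is a made-up helper. The input sizes are kept small on purpose because the work grows very quickly with each extra character.

```python
import re
import time

# Nested quantifier: a run of 'a' can be split between the inner and
# outer '+' in many ways, and the engine tries them all before giving up.
NESTED = re.compile(r"(a+)+$")
# Accepts the same strings, but leaves the engine far fewer alternatives.
FLAT = re.compile(r"a+$")


def seconds_to_reject(pattern: "re.Pattern[str]", n: int) -> float:
    """Time how long the pattern takes to reject 'a' * n followed by 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    pattern.match(text)  # never matches: the trailing 'b' blocks '$'
    return time.perf_counter() - start


if __name__ == "__main__":
    for n in (10, 14, 18):
        print(n, seconds_to_reject(NESTED, n), seconds_to_reject(FLAT, n))
```

On a backtracking engine the nested pattern's time should climb steeply as `n` grows while the flat one stays negligible. Engines that guarantee linear-time matching (RE2, for example) don't have this problem.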

It very much depends on what the use case is. I find that a lot of the text processing I do is easier with back references or other regexy things.
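As an example of the kind of job where a back reference earns its keep, here is a sketch in Python that finds accidentally doubled words. The pattern and the `doubled_words` name are assumptions for illustration, not something the commenter posted.

```python
import re
from typing import List

# \1 refers back to whatever the first group captured, so this matches
# a word, some whitespace, and then that exact same word again.
DOUBLED = re.compile(r"\b(\w+)\s+\1\b")


def doubled_words(text: str) -> List[str]:
    """Return each word that appears twice in a row (case-sensitive)."""
    return [m.group(1) for m in DOUBLED.finditer(text)]
```

The non-regex version would split on whitespace and compare neighbours, which works too but needs extra care around punctuation and line breaks.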

Having said this, I use tools that make regexes easy to use and readily available - I think in many programming languages the syntax means that other solutions are just as easy to devise and implement.

If you are a solo dev, you do you, but if you are working in a team and you are building huge regexes with back references and other bells and whistles... I would guess it's not very readable for your teammates. At least for me, when I look at such a regex I have to stare at it for minutes before grokking it.
I wonder if there's a metric for code reviews measuring mean-time-to-grok (MTTG).

Regexps are fairly terse and replace a lot of code. Compared to most languages they probably have an information density somewhere between 10x and 100x higher (i.e. it's not rare to replace 100 lines of code with 1 regex), so I think it's fair to expect it to take longer to unpack their meaning.
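One way to trade some of that density back for readability, sketched in Python: the same pattern written tersely and then with `re.VERBOSE`, which ignores whitespace in the pattern and allows `#` comments. The date pattern is just an illustrative choice, and it only checks the shape of the string, not whether the month or day is in range.

```python
import re

# Terse form: compact, but the reader has to decode each group.
TERSE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")

# Same pattern in verbose mode, with named groups and comments.
VERBOSE = re.compile(
    r"""
    ^
    (?P<year>\d{4})  -   # four-digit year
    (?P<month>\d{2}) -   # two-digit month
    (?P<day>\d{2})       # two-digit day
    $
    """,
    re.VERBOSE,
)
```

Both accept and reject the same strings; the second is just easier to review, which may help a bit with the staring-for-minutes problem.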

Would that be a reasonable time to ping that coworker and ask them "what does this do?", not because you can't figure it out, but because they already know?