

by St-Clock 5415 days ago

I agree and disagree:

"If I were to write out an explicit loop over the characters of the string, I would be a lot more sure that I wasn't accidentally dropping characters due to an inadvertent failure to make the regexp exhaustive"

This is why regex comments exist. For any non-trivial regex (more than two or three characters), you should break it down and document each part. Otherwise, it's worse than a 1000-character Perl one-liner.
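To illustrate what a broken-down, commented regex can look like, here is a minimal sketch using Python's `re.VERBOSE` flag. The pattern (a signed decimal number) and the names `NUMBER` and `is_number` are made up for this example and are not from the code under discussion.

```python
import re

# Hypothetical pattern: a signed decimal number with an optional exponent.
# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment.
NUMBER = re.compile(r"""
    [-+]?                    # optional sign
    (?: \d+ (?: \. \d* )?    # digits, optionally followed by a fraction: 12, 12., 12.5
      | \. \d+               # or a bare fraction: .5
    )
    (?: [eE] [-+]? \d+ )?    # optional exponent: e10, E-3
""", re.VERBOSE)


def is_number(text):
    """Return True if the whole string is a number according to NUMBER."""
    return NUMBER.fullmatch(text) is not None
```

Each alternative sits on its own line with a note on what it is meant to accept, which makes review easier, although it does not prove that every intended input is covered.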

"it's a lot easier to accidentally write an exponential-time algorithm in a regexp than in a nested loop"

So true. I did not realize it was possible until I made that mistake. Debugging these cases is extremely difficult: given two strings that look similar, the same regex can go crazy on one of them. But this has happened to me only once in the past three years (since I started relying heavily on regular expressions for a project).
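A small sketch of how this can happen in a backtracking engine such as Python's `re`, assuming a deliberately bad pattern with nested quantifiers. The patterns and the helper `time_match` are illustrative only, not the regex I actually hit this with.

```python
import re
import time

# Nested quantifiers: a run of n 'a's can be split into groups in about 2**n ways,
# and a backtracking engine tries them all before giving up on a failed match.
NESTED = re.compile(r"(a+)+$")
# A single quantifier accepts the same strings with no ambiguity to explore.
FLAT = re.compile(r"a+$")


def time_match(pattern, text):
    """Return (matched, seconds) for pattern.match(text)."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    for n in (10, 14, 18, 20):
        good = "a" * n          # matches, so the engine stops at the first success
        bad = "a" * n + "b"     # looks similar, but forces every split to be tried
        for label, text in (("good", good), ("bad", bad)):
            for name, pattern in (("nested", NESTED), ("flat", FLAT)):
                matched, seconds = time_match(pattern, text)
                print(f"n={n} {label:4} {name:6} matched={matched} {seconds:.4f}s")
```

The two inputs differ by one trailing character, yet only the non-matching one should show the running time of the nested pattern growing quickly with `n`, while the flat pattern stays fast on both.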

1 comment

kragen 5415 days ago

> This is why regex comments exist.

Regex comments don't help much with inadvertently writing a non-exhaustive regex (i.e. one for which some possible input could fail to match), or a few other kinds of regexp bugs. Or, how would you write the regexp in the above code with comments so that it would be obvious if you left out the \$ case?
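One way to at least make a non-exhaustive regexp fail loudly is to require that successive matches cover every character of the input. The sketch below assumes a made-up token pattern (literal text or a `$name` variable) that deliberately omits a backslash-escape case; `TOKEN` and `tokenize` are hypothetical names, and this is not the regexp from the code being discussed.

```python
import re

# Hypothetical token pattern: a run of literal text, or a $name variable.
# The backslash-escape case is deliberately missing, so the pattern is not exhaustive.
TOKEN = re.compile(r"[^$\\]+|\$\w+")


def tokenize(text):
    """Split text into tokens, raising if any character is not covered by TOKEN."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)   # anchored at pos, so no character can be skipped
        if m is None:
            raise ValueError(f"no token matches at offset {pos}: {text[pos:pos + 10]!r}")
        tokens.append(m.group())
        pos = m.end()
    return tokens
```

With this pattern, `TOKEN.findall(r"cost: \$5")` quietly skips the backslash, while `tokenize(r"cost: \$5")` raises at the first uncovered character. That catches the gap at run time for inputs that exercise it; it does not make the omission obvious from reading the regexp.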
