by bazoom42 420 days ago
> I tend to prefer to write out the procedural code, even if it is (much) longer in terms of lines.

This might work for you, but in general the amount of bugs is proportional to the amount of code. The regex engine is already thoroughly tested by someone else, while a custom implementation in procedural code will probably have bugs and be a lot more work to maintain if the pattern changes.

3 comments

> This might work for you, but in general the amount of bugs is proportional to the amount of code.

If you wanted to look for cases which serve as an exception to this rule, code relying on regexes would be an excellent place to start.

That is quite a generalization. The regex engine is tested, but my specific regular expression isn't. My ability to write correct regular expressions is weak, so there can be many bugs in the one line of regular expression.

If you have made a bug in the specification of the pattern to match, then you will have the same bug in the hand-rolled implementation of the matching. It will just be more difficult to find the bug since the pattern is not explicitly specified anymore.

In general, the correctness of the code is proportional to its readability.

I also prefer procedural code instead of regexes.

Surely complexity is a factor? A procedural implementation will necessarily have the same essential complexity as the regex it replaces, but then it will additionally have a bunch of incidental complexity in matching and looping and backtracking.

Regexes can certainly be hard to read - the solution is to use formatting and comments to make them easier to understand - not to drown the logic in reams of boilerplate code.
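
As a rough illustration of the formatting-and-comments approach, here is a minimal sketch using Python's `re.VERBOSE` flag, which ignores whitespace in the pattern and allows `#` comments. The date pattern and the `parse_date` helper are hypothetical, picked only for the example.

```python
import re

# Hypothetical pattern for illustration: a date written as YYYY-MM-DD.
DATE_RE = re.compile(
    r"""
    (?P<year>[0-9]{4})                  # four-digit year
    -                                   # literal separator
    (?P<month>0[1-9]|1[0-2])            # month 01-12
    -                                   # literal separator
    (?P<day>0[1-9]|[12][0-9]|3[01])     # day 01-31 (no per-month check)
    """,
    re.VERBOSE,
)


def parse_date(text):
    """Return (year, month, day) as ints, or None if text does not match."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None
    return int(m["year"]), int(m["month"]), int(m["day"])
```

Each sub-pattern sits on its own line with a comment, so the specification stays visible in one place. Note that the pattern does not check days per month, so something like 2023-02-31 still matches.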

> A procedural implementation will necessarily have the same essential complexity as the regex it replaces

I don't think I fully agree with this, and I don't see a basis for why this should be true. If I have a very specific implementation, it could have very little incidental complexity because it is fully targeted to the use case, whereas with regular expressions there is, by definition, the incidental complexity of the regex engine itself.

Complexity in the standard library is not that relevant. If you make your own custom dictionary implementation, you increase complexity of your code base compared to just using the one in the standard library, even if your own implementation is simpler.

The relevant complexity for using a regex is the complexity of the pattern itself and the complexity of invoking the regex. Any custom procedural solution will be more complex unless it is literally something as simple as checking whether a string contains a given literal string.
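
To make the comparison concrete, here is a small sketch of the same check written both ways in Python. The `key=value` format and both function names are assumptions made up for the example, not anything from the thread.

```python
import re
import string

# Hypothetical format for illustration: ASCII letters, "=", ASCII digits.
PAIR_RE = re.compile(r"([A-Za-z]+)=([0-9]+)")


def parse_pair_regex(text):
    """Regex version: the whole specification is the pattern above."""
    m = PAIR_RE.fullmatch(text)
    return (m.group(1), int(m.group(2))) if m else None


def parse_pair_manual(text):
    """Procedural version: the same rules spelled out as explicit checks."""
    key, sep, value = text.partition("=")
    if not sep or not key or not value:
        return None  # missing "=", empty key, or empty value
    if any(c not in string.ascii_letters for c in key):
        return None  # key must be ASCII letters only
    if any(c not in string.digits for c in value):
        return None  # value must be ASCII digits only (also rejects a second "=")
    return key, int(value)
```

Both functions accept and reject the same inputs here. Whether the extra lines in the manual version count as added complexity or as added clarity is the point under dispute.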

For some arbitrary definition of complex.