by lairv 426 days ago
My issue with regexes is that the formal definition of a regex I learned at university is clear and simple [0], but using them in programming languages is always a mess.

[0] https://en.wikipedia.org/wiki/Regular_expression#Formal_lang...

1 comment

The issue is that the formal definition of a regex only deals with whether a string belongs to the language recognized by the regex (boolean accept/reject), whereas regexes in practice usually talk in terms of "find the substring (if any) that matches". That causes problems: a regex is equivalent to an NFA, so a given string can be matched in possibly multiple ways, which forces you to bring in the notion of a "greedy" vs "non-greedy" match in order to disambiguate. Add on top of that the desire to define sub-matches in terms of capturing groups, and it's just a complete mess. And that's not even getting to the not-strictly-regular PCRE extensions like lookaround, backreferences, etc.
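To make that concrete, here is a rough sketch using Python's `re` module (my choice of language and of example strings, not anything specific from the thread). It contrasts the yes/no membership question with the "which substring, and how is it split?" question:

```python
import re

text = "<a><b>"

# Formal view: is the whole string in the language? A plain yes/no answer.
accepted = re.fullmatch(r"<.+>", text) is not None

# Practical view: which substring matched? Now the quantifier mode matters.
greedy = re.search(r"<.+>", text).group(0)   # takes as much as it can
lazy = re.search(r"<.+?>", text).group(0)    # takes as little as it can

# Capturing groups: "aaa" is accepted by both patterns, but the way the
# string is divided between the two groups differs.
greedy_groups = re.fullmatch(r"(a*)(a*)", "aaa").groups()
lazy_groups = re.fullmatch(r"(a*?)(a*)", "aaa").groups()

print(accepted, greedy, lazy, greedy_groups, lazy_groups)
```

Both patterns in each pair describe the same language; the difference only shows up once you ask for the matched substring or the group contents, which the formal definition never does.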