I never understood why people find regex so intimidating. Obviously you probably didn't go looking for the worst one of all, but the one you posted is very straightforward.
You jest, but that regex looks machine-generated. My Emacs is full of these in places used for syntax coloring, but I know these are optimized. There's an elisp function, regexp-opt, into which you can throw a bunch of strings, and you get out a regex like the one above.
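To make the idea concrete, here is a rough Python sketch of what such an optimizer does: it merges a list of literal strings into one regex that shares common prefixes. The function name `regex_opt` is made up for this illustration, the output uses Python syntax rather than Emacs syntax, and this is not a port of the Emacs function (called `regexp-opt` in current Emacs), only the general trie-based technique.

```python
import re

def regex_opt(words):
    """Build one regex that matches exactly the given words,
    factoring out shared prefixes (hypothetical helper, not the Emacs code)."""
    # Build a trie: each node maps a character to a child node.
    # The empty-string key marks "a word ends here".
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}

    def emit(node):
        ends_here = "" in node
        branches = [re.escape(ch) + emit(child)
                    for ch, child in sorted(node.items()) if ch != ""]
        if not branches:
            return ""
        if len(branches) == 1 and not ends_here:
            # A single continuation needs no grouping.
            return branches[0]
        body = "(?:" + "|".join(branches) + ")"
        # If a word may also end at this node, the rest is optional.
        return body + "?" if ends_here else body

    return emit(trie)

pattern = regex_opt(["if", "else", "elif", "while"])
print(pattern)  # (?:el(?:if|se)|if|while)
```

The output is compact for the machine and noticeably less pleasant for a human, which is probably why regexes generated this way look the way they do.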
To be honest, I was serious. Personally I believe that regular expressions are one of the few tools that are super useful even for people outside of IT, because everyone has to extract or format some text or table data from time to time. You can even learn them just by playing a game:
Regexes are dreaded as difficult to comprehend, but the real danger in using them is more subtle, especially nowadays when most text is UTF-8, possibly escaped, etc. Regexes are prone to misbehave in odd ways and to introduce security issues, so they should only be handled by expert programmers. Even parsing apparently simple stuff like email addresses, IP addresses, phone numbers and date/time is tricky, far beyond what a newbie would expect. There's a reason we have dedicated validation functions in PHP for all of the above. That said, regexes have their use cases too, and if your parsing case is not covered by a dedicated function, they are usually the best option.
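A small sketch of the IP address case, in Python rather than PHP. The naive pattern below is an assumption about what a newcomer might write, not something taken from this thread; the dedicated function is the standard library's `ipaddress.IPv4Address`. The helper names `is_ipv4_naive` and `is_ipv4` are made up for the example.

```python
import ipaddress
import re

# A naive pattern: four dot-separated groups of one to three digits.
NAIVE_IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def is_ipv4_naive(text):
    return NAIVE_IPV4.match(text) is not None

def is_ipv4(text):
    # Let the dedicated parser decide; it raises ValueError on bad input.
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False

samples = [
    "192.168.0.1",               # valid
    "999.999.999.999",           # octets out of range
    "1.2.3.4\n",                 # "$" also matches before a trailing newline
    "\u0661.\u0662.\u0663.\u0664",  # "\d" also matches non-ASCII digits
]
for s in samples:
    print(repr(s), is_ipv4_naive(s), is_ipv4(s))
```

Only the first sample is accepted by the dedicated parser, while the naive regex accepts all four. Each failure has a regex fix (range checks, `\Z`, `re.ASCII`), but you have to know to look for it.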
I never understood why people who understand regex don’t understand people who don’t understand regex. Obviously you are not the worst of all, but it’s not that hard to imagine how a regex looks to someone who doesn’t know regex, is it?
I couldn't agree more. I know regex fairly well and parsing regex is still annoying and takes a lot more concentration than just reading normal code.
Plus there are so many cases where people build insane regexes when they are just the wrong tool for the job, e.g. parsing/extracting or manipulating HTML. It always starts out with "I just need the src from that <img>, what could go wrong" and ends in despair, because you never just need that src and you never only deal with perfect HTML, and you'd be done already if you had just used some DOM parser.
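For illustration, a minimal sketch of the `<img>` case using Python's standard-library `html.parser` (an event-based HTML parser rather than a full DOM, but the point is the same). The class name `ImgSrcCollector`, the helper `img_sources`, the naive pattern and the sample markup are all invented for this example.

```python
from html.parser import HTMLParser
import re

class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> start tag."""
    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value is not None:
                    self.sources.append(value)

def img_sources(html):
    parser = ImgSrcCollector()
    parser.feed(html)
    parser.close()
    return parser.sources

# The "what could go wrong" regex.
NAIVE = re.compile(r'<img src="(.*?)"')

sample = """<IMG alt="logo" SRC='one.png'>
<img
   class=x src=two.png />
<!-- <img src="commented-out.png"> -->"""

print(img_sources(sample))    # ['one.png', 'two.png']
print(NAIVE.findall(sample))  # ['commented-out.png']
```

On this less-than-perfect markup the regex misses both real images (uppercase tag, single quotes, unquoted value, attribute order, line break) and picks up the one inside a comment.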
Yeah, I get that regular expressions might look complex and tangled, the way Brainfuck looks to me since I never tried to learn it. Yet I keep seeing comments about how regular expressions are hard to understand from all kinds of IT people who solve puzzles a hundred times more complex every day. I guess it's just a reputation that sticks to certain technologies and really has nothing to do with actual complexity.
Experience, I guess. I've spent hundreds of hours debugging and fixing regexes that other people wrote, usually just to find there's a quirk in a certain regex parser implementation.
Regexes are easy to understand if you write them, but reading them can take lots of time.
Note that HN formatting messed it up (there are stars missing before the first two closing parens). The regex itself is indeed quite straightforward, just a bit hard to read due to all the required backslash-escaping.