by obtino 2509 days ago
The trouble with regular expressions is not their use, but the different "standards" (or lack thereof) and figuring out what the interpreter supports.
3 comments

Yes, two issues are a) knowing what to escape and b) what is supported or not. I was weirdly surprised to discover that there are some useful functionalities that are basically supported nowhere... except in vim. I think it was variable-length lookbehinds but don't quote me on this.
Lookbehind is restricted in many regex engines that support it (many don't at all). Some require the lookbehind to be constant-length (so if you have alternatives in there, they all have to have the same length, and you basically can't use quantifiers). Some require it to be finite-but-known-ahead-of-time-length, so something like (?<=a{3,6}) is okay, but (?<=a{3,}) is not. In those engines (?<=a|bb) would be okay as well.
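
To make the constant-length case concrete, here is a rough sketch using Python's built-in re module, which as far as I know falls in the strictest group. The engine choice and the `compiles` helper are my own assumptions for illustration, not something from the thread; other engines will accept or reject different subsets of these patterns.

```python
import re

def compiles(pattern: str) -> bool:
    """Hypothetical helper: True if the stdlib `re` module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Alternatives of the same length: accepted.
same_length = compiles(r"(?<=ab|cd)x")

# Alternatives of different lengths: rejected by `re`
# (a finite-length engine would accept this one).
mixed_length = compiles(r"(?<=a|bb)x")

# Bounded quantifier: finite, but not constant, so `re` rejects it too.
bounded = compiles(r"(?<=a{3,6})x")

# Unbounded quantifier: rejected by `re`.
unbounded = compiles(r"(?<=a{3,})x")

print(same_length, mixed_length, bounded, unbounded)
```

Only the first pattern should compile here; the other three fail with a "look-behind requires fixed-width pattern" error.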

.NET's System.Text.RegularExpressions.Regex is one implementation that has no restrictions on lookbehind. Having used PowerShell for so long, I now sometimes forget about this when writing regexes for other implementations, as it's really convenient at times.

Yeah, typically you would get an error if you use variable-length lookbehinds.

A few do support it (for example, Python's third-party regex module), and sometimes you can work around it with \K (similar to \zs in vim) [1].

And there are other frustrating differences between implementations: for example, the definition of \g is very different between Ruby and Python, character set operations are not found everywhere, etc. Plus, the BRE/ERE flavors found in command-line tools do not even support features like non-greedy quantifiers and lookaround.

[1] https://stackoverflow.com/questions/11640447/variable-length...
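
Where neither variable-length lookbehind nor \K is available (Python's built-in re module has neither), a capture group usually gets the same result: match the prefix normally and keep only the captured part. This is a swapped-in technique, not \K itself, and the sample text and pattern below are made up for illustration.

```python
import re

# Hypothetical input: values preceded by a colon and a varying amount of whitespace.
text = "price: 42, cost:    17"

# A lookbehind such as (?<=:\s*) is variable-length, so `re` would reject it.
# Instead, consume the prefix as part of the match and capture only the digits.
pattern = re.compile(r":\s*(\d+)")

# With exactly one group, findall returns the captured text of each match.
numbers = pattern.findall(text)
print(numbers)
```

In an engine with \K, the equivalent would presumably be something like `:\s*\K\d+`, where everything before \K is matched but dropped from the reported result.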

My knowledge of regex is basic. Last year, I ended up writing a piece of code that read an inbound email, parsed some of the data, and, depending on the type of the email and the data stored, created a lead record in the system with the captured data. I wrote this in Apex, which is Salesforce's proprietary language and is based on Java. Apex should adhere to the Java implementation of regex, but some things are still different. The regex testing website showed one thing, the Java docs another, and Salesforce something else altogether... It took me a while.