Hacker News
by throw681158 2182 days ago
Re: already in a string: one of the primary uses of regex is to search from point in a text editor. So the cursor is in a string and you want to find the next string. Regex won't work on its own, you generally need more semantic information to differentiate opening & closing quotes (unless you can use local context from that particular language to infer it).

But more broadly, any situation where you search from a non-zero index has this problem.

I'm surprised your example works in Python. Is that a property of Python's parser, or all regex matchers?

1 comment

> Regex won't work on its own, you generally need more semantic information

Yeah, I agree. My point was that regex won't work whether or not you have backreferences, so backreferences won't help.

> But more broadly, any situation where you search from a non-zero index has this problem.

I'm not sure I understand that. A lot of regex libraries let you specify a start index. The search won't take into account data from before the start index, though (regex doesn't really do that, regardless of backreferences). If your regex library doesn't support passing in a start index, you can just take a substring starting at that index, then search the substring.
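To make that concrete, here is a minimal Python sketch of both options. The sample text, the quote pattern, and the cursor position `start` are made up for illustration. Note that in Python the two options are not completely equivalent for patterns using `^`, which still anchors at the real beginning of the string when you pass a start index.

```python
import re

# Hypothetical sample text and a naive "quoted string" pattern.
text = 'say "one" then "two"'
pattern = re.compile(r'"[^"]*"')

# Hypothetical cursor position: the 'n' inside "one".
start = 6

# Option 1: pass the start index to the compiled pattern's search().
m1 = pattern.search(text, start)

# Option 2: slice first, then search the substring.
# Offsets are then relative to the slice, so add `start` back.
m2 = pattern.search(text[start:])

print(m1.group(), m1.span())
print(m2.group(), (m2.start() + start, m2.end() + start))
```

Both searches return `" then "`: the closing quote of the first string gets treated as an opening quote, which is the problem with starting inside a string. Neither option knows what came before the start index.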

I don't think Python is really special. Python's findall() is just a convenience function that loops: find a match, then find the next match starting where the previous one ended, and so on. Most languages provide a way to find the end point of the most recent match, and then you can just write the loop yourself to start the next search at that point.
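Roughly, the hand-written loop could look like the sketch below. `find_all_manually` is a hypothetical helper name, not anything from the standard library, and it only aims to mirror findall() for the simple case: a pattern with no capture groups that never matches the empty string.

```python
import re

def find_all_manually(pattern, text):
    """Collect non-overlapping matches by restarting each search
    at the end of the previous match (hypothetical helper)."""
    results = []
    pos = 0
    while pos <= len(text):
        m = pattern.search(text, pos)
        if m is None:
            break
        results.append(m.group())
        # Resume at the end of this match. If the match was empty,
        # step forward one character so the loop always advances.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return results

pattern = re.compile(r'"[^"]*"')
text = 'a "one" b "two" c'
print(find_all_manually(pattern, text))
print(pattern.findall(text))
```

For this input both calls give the same list. With capture groups or patterns that can match the empty string, findall() has its own rules that this sketch doesn't try to reproduce.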