by danielparks
493 days ago
Cool, I’m interested to see where you go with this. I found the operator precedence unnatural, and it looks like a lot of other folks in this thread did too. I would naturally assume `cat:dog` would be equivalent to `(cat):(dog)` rather than `ca(t:d)og`.
> I would naturally assume `cat:dog` would be equivalent to `(cat):(dog)` rather than `ca(t:d)og`
It was confusing to me too until I remembered that we all kind of use regexes sort of wrong. They're "really" supposed to be considered as generators and not matchers. So IIRC, cat|dog as a "regular expression" (not a regex) is supposed to formally expand to
{catog,cadog}
For matching, this set of strings can then be substring matched against some larger text.
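To make the generator view concrete, here is a minimal Python sketch. It is my own illustration, not how any particular regex engine works, and the tuple-based AST plus the names `generate` and `matches` are hypothetical. It enumerates the strings produced by both readings, `(cat)|(dog)` and `ca(t|d)og`, and then substring-matches them against a text:

```python
from itertools import product

# Hypothetical tiny AST for finite expressions (no closures):
#   "abc"                 -> literal
#   ("seq", a, b, ...)    -> concatenation
#   ("alt", a, b, ...)    -> alternation

def generate(node):
    """Return the set of strings that a finite expression generates."""
    if isinstance(node, str):
        return {node}
    op, *parts = node
    if op == "alt":
        # Alternation: union of what each branch generates.
        return set().union(*(generate(p) for p in parts))
    if op == "seq":
        # Concatenation: every combination of one string per part.
        return {"".join(combo) for combo in product(*(generate(p) for p in parts))}
    raise ValueError(f"unknown operator: {op}")

def matches(node, text):
    """Substring-match the generated set against a larger text."""
    return any(s in text for s in generate(node))

whole_words = ("alt", "cat", "dog")                  # (cat)|(dog)
inner_chars = ("seq", "ca", ("alt", "t", "d"), "og")  # ca(t|d)og

print(sorted(generate(whole_words)))  # ['cat', 'dog']
print(sorted(generate(inner_chars)))  # ['cadog', 'catog']
print(matches(inner_chars, "a cat and a dog"))  # False
```

The two groupings generate disjoint sets, so which one a tool picks for the unparenthesized form completely changes what it matches.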
The problem is that almost no regex matching engine actually does this, and so now they'll do all kinds of strange things either to meet our expectations, or for efficiency or something.
If you go and try a bunch of different regex tools you'll get variations that behave as either (cat)|(dog), or (cat)|(dog)|(ca[td]og), or something else.
So from a more formal conceptualization I think cat:dog should produce ca(t:d)og not (cat):(dog). But our experience with "regex" tools has subverted that formalization and now everybody just puts parens around expressions they want to alternate.
My one real (if minor) issue with this proposal, as interesting and well thought out as it is, is that it feels like it's just trying to get back to regular expressions as generators, which is what they actually are, and it's coming from the far side of a few decades of us abusing them as regexes to meet user expectations. In other words, the problem is the tooling, not the syntax.
Source: I've worked adjacent to this space in the past, and if you've never thought of regexes as string set generators you can toy with the idea here:
https://onlinestringtools.com/generate-string-from-regex
But again, how these generator tools work is also very specific. The ones I used to work with had a variety of ways to specify constraints on closures and such to restrict the generators.