Hacker News new | ask | show | jobs
by kazinator 492 days ago
[a-z] is a semantically equivalent regex to a|b|..|z, but the two are not equivalent syntactic forms.

Distinct syntactic forms can be given distinct semantics, as long as there is rhyme and reason.

Moreover, the right side of the colon is not the normal regex language, it only borrows its syntax. So there, we may be justified in denying that the usual equivalence holds between character class syntax and a disjunction of the symbols denoted by the class.

1 comments

The right side is a normal regex language syntactically. Semantically it is a generator instead of a parser (consumer).

But I got your point. Maybe there could be some ways to do it in consistent way. Just straight tr-like syntax won't work, e.g I really want it something like this to be valid:

[a-b]:(x|y) (pairs a:x, b:x, a:y, b:y)

and I prefer not handle these in some ad-hoc way.

I also go your point. The right side is a regular expression because it denotes a regular set.