by awirth 2188 days ago
This reduction is really cool. I love reductions like this.

Is there a general consensus to use "regular expression" to refer to the actual regular ones and "regex" to refer to the non-regular variants?

7 comments

I think Raku (née Perl 6) has been spearheading that distinction

https://docs.raku.org/language/regexes (see the intro paragraph)

This distinction was introduced by Jeffrey Friedl’s book _Mastering Regular Expressions_ (2006), and it seems to be fairly commonly used now.
Too late to edit, but out of curiosity I re-implemented this to use the PCRE JIT (in PHP) to see what kind of speedup it would provide: https://gist.github.com/allanlw/69df509519335b88db886d48503a...

Timings for fred.cnf on my machine:

python: 0m53.744s

PHP (no PCRE JIT): (hits backtrack limit in 1m15.994s)

PHP (PCRE JIT): 0m20.109s

I wouldn't say so, but I use the term "regular language" if I mean the mathematical concept.
I don’t think it’s pedantic to say that a regular language is not the same thing as a regular expression. The difference between syntax and semantics is real and important.
(late reply) Right, that's what I'm saying. Who said it was pedantic? :)
But as mathematical concepts, a "regular language" is not the same as a "regular expression".
(late reply) Hm, you've neglected to do the obvious thing and explain the difference.

I don't think there's a difference, and if there is one, it's probably not relevant to programming. Whereas the one I'm highlighting is relevant to programming.

I don't think so. Usually you can tell from the context:

# math and/or computer science texts? It's the regular ones.

# pretty much elsewhere? It's the extended ones.

# threads about parsing HTML with regular expressions? People using both and insisting their version is the only correct one.
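
For anyone who wants the "regular" vs. "extended" split made concrete, here is a minimal sketch in Python's `re` module. It assumes nothing beyond the standard library; the names `MIRRORED`, `PLAIN`, `is_mirrored` and `is_plain` are made up for illustration. The first pattern uses a backreference, which lets it accept exactly the strings of the form a^n b a^n, a textbook non-regular language. The second uses only concatenation and Kleene star, so it stays within what a finite automaton can recognize.

```python
import re

# "Extended" regex: \1 must repeat exactly what group 1 captured,
# so this accepts a^n b a^n (same number of a's on both sides).
# That language is not regular, so no true regular expression describes it.
MIRRORED = re.compile(r"(a*)b\1")

# "Regular" regular expression: only concatenation and Kleene star.
# It accepts any number of a's on either side of the b.
PLAIN = re.compile(r"a*ba*")


def is_mirrored(s: str) -> bool:
    """Return True if s is a^n b a^n for some n >= 0."""
    return MIRRORED.fullmatch(s) is not None


def is_plain(s: str) -> bool:
    """Return True if s is a^m b a^n for any m, n >= 0."""
    return PLAIN.fullmatch(s) is not None
```

With these, `is_mirrored("aabaa")` holds while `is_mirrored("aabaaa")` does not, whereas `is_plain` accepts both strings.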

Yes, for at least ten years and likely much longer.

https://cstheory.stackexchange.com/q/448/362

Interesting, I wasn't aware of the history.

Personally I make the distinction, but I've noticed many, many people do not, hence the question.

People without theoretical CS background probably think of regular expressions (or regexes for short) as just text-matching thingies that sometimes give you two problems and would look at you blankly if you mention the Chomsky hierarchy. When we don’t know there’s a distinction to be made, we go for economy of expression.
Isn't it PCRE that we started to call regex?