Hacker News
by throwbsidbdk 3486 days ago
Fun factoid most have forgotten: regex is perl. The beginnings are elsewhere but regex as we know it was designed as part of the language and the engine was pulled out and reused when people found how useful it was.

Perl's regex parser is still far above the features in more modern languages, supporting, among other things, code execution within capture groups. If I remember right, the perl regex parser is actually Turing complete.

5 comments

I know you're not trying to say that regular expressions were created as part of Perl, but I think you're giving a bit too much credit to it[1] regarding regexes.

The PCRE library is indeed used all over. And Perl was, I think, the first first-class scripting language that integrated regexes so closely to control structures and other language features in a way that feels truly natural.

There are still a lot of tools out there that use other regex libraries. Don't have it in front of me, but there's a lovely chart in the book _Mastering Regular Expressions_[2] that breaks out regular expression library use by tool. But, generally, I think the diversity of regex libraries actually causes problems for adoption these days, because people who are tempted to use them (and thus learn more) tend to run into other tools where the things they've learned mysteriously don't work anymore, which scares them off.

Anyway, regular expressions in the wild go back to Unix v.4, which included Ken Thompson's grep.

[1] Perl deserves a ton of credit it doesn't get in general, including credit for giving the world PCREs.

[2] In general, if you work with regexes a lot and don't own this book, you're doing yourself a disservice. It is one of my top-10 technical books, not just for density of actionable information, but also for the pure general excellence.

Have a look at grammars in Perl6 and the new regexen. Light years ahead of anything else. Perl6 also does numeric division properly and, if I'm not mistaken, eliminates NPEs so what's not to like?
What you describe happened far earlier. As far as I understand it, regexps were originally a part of ed (having been derived from QED), the original Unix text editor. Its “g” command with a “p” flag, or “g/re/p”, for globally searching for a regexp and printing the matching lines, was later found so useful that it was implemented into a separate utility, “grep”. Many Unix utilities started using regular expressions from then on, including Perl.
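
To make the "g/re/p" idea concrete, here is a minimal sketch in Python of "globally search for a regexp and print the matching lines". The function name `grep` and the sample lines are made up for illustration, and Python's `re` dialect is not the same as the basic regular expressions ed used, so treat this as an analogy rather than a reimplementation.

```python
import re
from typing import Iterable, Iterator


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield each line containing a match for pattern, in the spirit of ed's g/re/p."""
    regex = re.compile(pattern)  # compile once, reuse for every line
    for line in lines:
        # search() looks for a match anywhere in the line, not just at the start
        if regex.search(line):
            yield line


# Hypothetical sample input, standing in for the lines of an editor buffer.
sample = [
    "ed is the standard editor",
    "grep came from ed",
    "perl arrived later",
]

# "g" = visit every line, "re" = the pattern, "p" = print the ones that match.
matches = list(grep(r"\bed\b", sample))
for line in matches:
    print(line)
```

The generator keeps the "visit every line" part lazy, so the same function would work on a file object as well as on a list.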
Maybe I was a bit too excited about the perl part. Perl perfected regex, and the perl regex engine was integrated into other languages until it became a normal language feature.

Regex as we know it was largely a result of the adoption of perl and the flexibility of its regex engine.

Thus the PCRE regex library, Perl-Compatible Regular Expressions, for instance.
Note that this library is by Philip Hazel and did not originate in the Perl source code, but in Exim.

Regex facilities for text processing were first implemented by Ken Thompson, long before Perl.

On the topic of implementations, this is important: https://swtch.com/~rsc/regexp/regexp1.html

PCRE is a nice library. I read (on its site, IIRC) that it is used for the regex support in Python and some other languages.

I once worked - as part of new product work in an enterprise company - on building the PCRE library as an object file on multiple Unixes from different vendors (like IBM AIX, HP-UX, Solaris, DEC Ultrix, etc.) and also on Windows (including on both 32-bit and 64-bit variants of some of those OSes), using the C compiler toolchain on each of those platforms. I was a bit surprised to see the amount of variation in the toolchain commands and flags (command-line options) across the tools on all those Unixes. But on further thought, knowing about the Unix wars [1] and configure [2], maybe I should not have been surprised.

[1] https://en.wikipedia.org/wiki/Unix_wars

[2] https://en.wikipedia.org/wiki/Configure_script

grep == global regular expression print, and it has its origins in ed from pre-vi days, if my memory of passed-on lore is correct.