Hacker News
by jmnicolas 4497 days ago
What I really wished for when I read the title was something that would let me write regexes, but verbosely.
3 comments

You can write more verbose regex using Named Capture Buffers. Here is an example I posted on HN not long ago: https://news.ycombinator.com/item?id=6895126

NB. Follow link to original post to compare against standard regex version.
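
For anyone who wants to see the idea without following the link, here is a minimal sketch in Python, whose `re` module spells named groups `(?P<name>...)`. The log line and the group names are invented for illustration and are not taken from the linked post.

```python
import re

# Hypothetical log line, made up for this sketch.
line = "2014-02-24 ERROR disk full"

# Each group carries a name, so the pattern documents what it captures.
pattern = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)"
)

match = pattern.match(line)
# groupdict() returns the captures keyed by group name.
fields = match.groupdict() if match else {}
print(fields)
```

The captures come back keyed by name rather than by position, which is most of what makes the longer pattern easier to read later.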

There are also some nice grammar parsers available in some languages which make this even easier. For examples of this see Perl6 Rules/Grammar, Perl5 Regexp::Grammars or (for something which doesn't use regex at all) Rebol Parse.

For example, here is my Rebol version of the HN post above: http://www.reddit.com/r/programming/comments/1smpa1/why_rebo...

And here is a presentation which shows a great example using Perl6 grammars: http://jnthn.net/papers/2014-fosdem-perl6-today.pdf

Refs:

- http://en.wikibooks.org/wiki/Perl_6_Programming/Grammars

- http://en.wikipedia.org/wiki/Perl_6_rules

- https://metacpan.org/pod/Regexp::Grammars

- http://www.rebol.com/docs/core23/rebolcore-15.html

- http://blog.hostilefork.com/why-rebol-red-parse-cool/

PS. Alternatively, if you are looking for something interactive, then check out tools like these: http://rebol.informe.com/blog/2013/07/01/parse-aid/ | https://metacpan.org/pod/Regexp::Debugger

In what way would you write regexes verbosely? I'm actually quite interested in the idea because regexes can be confusing to write at times, and it's difficult to remember which form to use where, if you use them in many languages/interfaces.

There are tools like Regexper[1] that let you visualize a regex as an automaton graph, and tools like txt2re[2] that let you enter sample text and visually generate a regex to match it.

I feel like better regex tools should exist on the command line, and it's potentially a great place for such tools to be rapidly developed and adopted. There are GUI tools for this like poirot[3], but the command line still exists because of its accessibility, uniformity, and extensibility.

links:

[1] http://www.regexper.com/

[2] http://txt2re.com/index.php3?s=24%3AFeb%3A2014+%22This+is+an...

[3] http://www.espgraphics.com/poirot/

I'm probably heavily biased, but to me Perl is the best command-line regex tool. Perl was invented to gather data and report on it, and its regex engine is incredibly fast and powerful. As an added bonus it supports some Python and PCRE-specific extensions. But this Q app is useful for people who either don't know Perl or can get what they need done faster with SQL than with scripting.

In terms of 'verbosity' you can embed comments inside a regular expression, or build a regular expression over multiple lines, or make a set of regex objects and interpolate them into larger regexes. Perl has copious amounts of documentation to help you understand the many ways to use regexes in Perl.
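
A rough sketch of those three techniques, written in Python rather than Perl: `re.VERBOSE` plays roughly the role of Perl's `/x` flag, and f-string interpolation stands in for interpolating regex objects. The `OCTET`, `IPV4` and `REQUEST` names and the sample input are made up for the example.

```python
import re

# Small pieces, built separately and interpolated into a larger pattern.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4 = rf"{OCTET}(?:\.{OCTET}){{3}}"

# re.VERBOSE ignores unescaped whitespace and allows # comments,
# so the pattern can be spread over several annotated lines.
REQUEST = re.compile(
    rf"""
    ^(?P<ip>{IPV4})      # client address
    \s+
    (?P<method>[A-Z]+)   # HTTP method
    \s+
    (?P<path>\S+)        # request path
    $
    """,
    re.VERBOSE,
)

m = REQUEST.match("192.168.0.1 GET /index.html")
result = m.groupdict() if m else None
print(result)
```

Because the octet rule lives in one place, tightening it changes every pattern built from it.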

http://perldoc.perl.org/perlrequick.html
http://perldoc.perl.org/perlretut.html
http://perldoc.perl.org/perlfaq6.html#How-can-I-hope-to-use-...

> As an added bonus it supports some Python and PCRE-specific extensions.

This is a bit of a strange thing to say, since nearly all of the advanced regex features showed up in Perl first. PCRE stands for "Perl Compatible Regular Expressions," so there are certainly no extensions there that didn't originally come from Perl. I'm less sure about Python, but I get the sense that it borrows from Perl regular expressions as well.

http://perldoc.perl.org/perlre.html#PCRE/Python-Support

  PCRE/Python Support
  
  As of Perl 5.10.0, Perl supports several Python/PCRE-specific extensions to the
  regex syntax. While Perl programmers are encouraged to use the Perl-specific
  syntax, the following are also accepted:
  
      (?P<NAME>pattern)
      Define a named capture group. Equivalent to (?<NAME>pattern).
  
      (?P=NAME)
      Backreference to a named capture group. Equivalent to \g{NAME}.
  
      (?P>NAME)
      Subroutine call to a named capture group. Equivalent to (?&NAME).
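
The first two forms in that list can be tried directly in Python's own `re` module. A small sketch, with a made-up sample sentence; as far as I know the standard `re` module has no counterpart to the `(?P>NAME)` subroutine call, so it is left out here.

```python
import re

# (?P<word>...) names the group; (?P=word) must match the same text again.
doubled = re.compile(r"\b(?P<word>\w+)\s+(?P=word)\b", re.IGNORECASE)

m = doubled.search("Paris in the the spring")
repeated = m.group("word") if m else None
print(repeated)
```
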
> In what way would you write regexes verbosely?

Something like SQL would be fine.

It's not a fully thought-out theory, but I think I'd like to manipulate text via a programming language the way vi gods manipulate text with shortcuts.

Thanks for the links, I will have a look at them.

A good compromise is to use something like Grok. https://github.com/jordansissel/grok
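
Grok's core idea is a library of named patterns that you reference as `%{NAME:field}` instead of spelling out the regex each time. The sketch below imitates that idea in a few lines of Python. It is not Grok's implementation: the `PATTERNS` table, the `expand` helper and the sample input are all hypothetical, and real Grok ships a much larger pattern library.

```python
import re

# Hypothetical pattern library, far smaller than the one Grok ships.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"\d+",
}


def expand(template: str) -> str:
    """Turn %{NAME:field} references into named capture groups."""

    def substitute(ref: re.Match) -> str:
        name, field = ref.group(1), ref.group(2)
        return f"(?P<{field}>{PATTERNS[name]})"

    return re.sub(r"%\{(\w+):(\w+)\}", substitute, template)


regex = re.compile(expand("%{IP:client} %{WORD:method} %{NUMBER:bytes}"))
m = regex.match("10.0.0.7 GET 5120")
parsed = m.groupdict() if m else None
print(parsed)
```

The template reads almost like a sentence, while the actual regexes stay in one table where they can be tested once.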