HN Mirror


by b2gills 2890 days ago

Perl 6 regexes attempt to improve on this situation by making regexes more like a regular programming language. That is, the design errs on the side of error detection rather than encoding efficiency. (It also adds features that would be difficult to add to the Perl 5/PCRE regex design.)

For a start, if it didn't support `+`, then any attempt to use it would generate a compile-time error, because `+` is not alphanumeric. (A regex is code in Perl 6.)

All non-alphanumeric characters are presumed to be metasyntactic, and so must be escaped in some way to match literally. Arguably the best way is to quote the character like a string literal. (This uses the same domain-specific sub-language that the main language uses for string literals.)

    / "+" + /   # at least one + character
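
As a rough sketch of the quoting rule (the input string `$text` is made up for illustration), a literal `+` can be spelled in a few equivalent ways, each followed here by the `+` quantifier:

```raku
# Hypothetical input, used only for illustration.
my $text = 'a++b';

say so $text ~~ / a "+" + b /;   # double-quoted literal, then the + quantifier
say so $text ~~ / a '+' + b /;   # single-quoted literal
say so $text ~~ / a \+ + b /;    # backslash escape
```

All three checks should succeed, since each pattern asks for an `a`, one or more literal `+` characters, and a `b`.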

It really is a significant redesign.

    /A{2,4}/    # Perl 5/PCRE
    /A ** 2..4/ # Perl 6

    /A (?:BA){1,3}/x
    /A [BA] ** 1..3/ # Perl 6: direct translation
    /A ** 2..4 % B/  # Perl 6: 2 to 4 A's separated by B

    /A (?:BA){1,3} B?/x
    /A ** 2..4 %% B/   # Perl 6: %% allows trailing separator

    /\" [^"]* \"/x     # Perl 5/PCRE
    /\" <-["]>* \"/    # Perl 6: direct translation
    /｢"｣ ~ ｢"｣ <-["]>*/ # Perl 6: between two ", match anything else
                       # (can be used to generate better error messages)

    ---

    # Perl 5
    my $foo = qr/foo/;
    'abfoo' =~ /ab $foo/x;

    # Perl 6
    my $foo = /foo/;
    'abfoo' ~~ /ab <$foo>/;
    # or
    my token foo {foo}     # treat it as a lexical subroutine
    'abfoo' ~~ /ab <&foo>/;

    ---

    # Perl 5
    my $foo = 'foo';
    'abfoo' =~ /ab \Q$foo\E/x; # treat as string not regex (no spaces inside \Q..\E: they would be quoted and matched literally)
    # Perl 6
    my $foo = 'foo';
    'abfoo' ~~ /ab $foo/; # that is the default in Perl 6
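
The difference between the two interpolation forms is easier to see when the string contains a metacharacter. A minimal sketch, assuming a made-up pattern string `$pat`:

```raku
# Hypothetical pattern string, chosen only to show the difference.
my $pat = 'a+';

say so 'aaa' ~~ / ^ $pat $ /;     # $pat is matched as the literal text "a+"
say so 'a+'  ~~ / ^ $pat $ /;     # the literal text "a+" is what the target holds
say so 'aaa' ~~ / ^ <$pat> $ /;   # <$pat> compiles the string as a regex
```

The first check is expected to fail and the other two to succeed: a bare `$pat` behaves like Perl 5's `\Q...\E`, while `<$pat>` opts in to treating the string as regex source.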