by b2gills
2892 days ago
I like the way Perl 6 handles this with the grammar feature.
(A grammar is just a special type of class, and a regex is just a special type of method.) The grammar below could be simpler, but I wanted the resulting data structure to be easy to use.
grammar Url {
    # default regex/token/rule/method to call
    # (token disables backtracking)
    token TOP {
        <protocol> <domain> <path> <query> <fragment>
    }
    token protocol {
        <(
            <[a..z]> ** 3..10
        )>     # don't include :// in the stringified result
        '://'  # quoted because it isn't alphanumeric
    }
    token domain-segment { <-[?#/.]>+ }
    token domain {
        <domain-segment> ** 2..*  # at least 2 domain segments
        % '.'                     # separated by .
        <?{
            # make sure that the last segment is at least 3 chars
            # (using the Boolean result of regular Perl 6 code)
            @<domain-segment>.tail.chars >= 3
        }>
    }
    token path-segment { <-[?#/\\]>+ }
    token path {
        [
            <[/\\]>
            <path-segment>*
            %% <[/\\]>  # separated by path separator (allow trailing)
        ]?
    }
    token query-segment {
        # store as named, rather than positional
        $<key> = ( <-[#=&]>+ )
        '='
        $<value> = ( <-[#=&]>+ )
        # run regular Perl 6 code in the regex
        {
            # attach a Pair object as the AST
            make ~$<key> => val(~$<value>)
            # (`val` turns a numeric value into an allomorph)
        }
    }
    token query {
        [
            '?'
            <(  # don't include ? in the stringified result
                <query-segment>*
                % '&'  # separated by & (no trailing allowed)
            )>
        ]?
        {
            # attach a static associative array of the key value pairs
            # as the AST
            make Map.new: (@<query-segment>».ast if @<query-segment>.elems)
        }
    }
    token fragment {
        [
            '#'
            <( .* )>  # don't include '#' in the stringified result
        ]?
    }
}
Example usage:
> my $result = Url.parse('http://perl6.org/foo/bar/baz/?a=1&b=2#fragment');
> say $result;
「http://perl6.org/foo/bar/baz/?a=1&b=2#fragment」
 protocol => 「http」
 domain => 「perl6.org」
  domain-segment => 「perl6」
  domain-segment => 「org」
 path => 「/foo/bar/baz/」
  path-segment => 「foo」
  path-segment => 「bar」
  path-segment => 「baz」
 query => 「a=1&b=2」
  query-segment => 「a=1」
   key => 「a」
   value => 「1」
  query-segment => 「b=2」
   key => 「b」
   value => 「2」
 fragment => 「fragment」
> say $result<query>.ast;
Map.new((:a(IntStr.new(1, "1")),:b(IntStr.new(2, "2"))))
> my %query := $result<query>.ast;
> say %query<b> ~~ Int; # True (because of val(…))
True
A more advanced usage would be with an actions class.
Basically, Perl 6 treats regular expressions as code written in a domain-specific sublanguage, with grammars acting as a structure to hang them on.
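To give a rough idea of the actions-class approach mentioned above, here is a minimal sketch that handles only the query-string part. The names QueryString and QueryActions are made up for this illustration; the point is just that the `make` calls move out of the regexes and into methods named after the tokens, so the grammar stays purely declarative.

```raku
# Hypothetical, cut-down grammar: only the query string, no embedded code.
grammar QueryString {
    token TOP { <query-segment>* % '&' }
    token query-segment {
        $<key>   = ( <-[#=&]>+ )
        '='
        $<value> = ( <-[#=&]>+ )
    }
}

# Hypothetical actions class: each method is called with the Match
# object after the token of the same name has matched.
class QueryActions {
    method query-segment($/) {
        # same Pair as before, with `val` producing an allomorph
        make ~$<key> => val(~$<value>)
    }
    method TOP($/) {
        make Map.new: $<query-segment>».ast
    }
}

my %query := QueryString.parse('a=1&b=2', actions => QueryActions).made;
say %query<a>;
say %query<b> ~~ Int;
```

The same grammar can then be paired with a different actions class (or none at all) without touching the regexes.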