Hacker News
by extortionist 1699 days ago
My job is split between maintaining a giant legacy perl codebase and doing new dev work in python. I vastly prefer python overall, but perl's still the best when you need quick one-off scripts to tear through log files or do some quick regex subs on a bunch of files.

I don't think python has an equivalent of 'perl -p -i.bak -e 's/foo/bar/g' *', for example (which will do in-place regex substitutions on whichever files are passed in, additionally copying the original files to 'original.ext.bak').
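
For what it's worth, Python's standard library can get reasonably close with the `fileinput` module, just not as a one-liner. A rough sketch, where `sub_in_place` is a made-up helper name and not anything built in:

```python
import fileinput
import re
import sys


def sub_in_place(paths, pattern, repl, backup=".bak"):
    """Regex-substitute in each file, keeping the original as <name><backup>."""
    if not paths:
        # fileinput falls back to stdin when given no files; avoid that here.
        return
    regex = re.compile(pattern)
    # With inplace=True, fileinput redirects sys.stdout into the file being
    # read and moves the original aside using the backup extension.
    with fileinput.input(files=paths, inplace=True, backup=backup) as lines:
        for line in lines:
            sys.stdout.write(regex.sub(repl, line))
```

Calling `sub_in_place(sys.argv[1:], r"foo", "bar")` from a small script should behave roughly like the Perl one-liner above, at the cost of a lot more typing.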

The regex syntax is also a bit more streamlined in perl, so I find it quicker and easier to throw together a script that's just doing regex stuff in it. A simple example:

>> $string =~ m/(pattern)/i;

>> my $match_text = $1 || '';

>> $string =~ s/pattern//gi;

>> # The same thing in Python:

>> import re

>> match_obj = re.search(r'(pattern)', string, flags=re.I)

>> match_text = ''

>> if match_obj is not None:

>>     match_text = match_obj.group(1)

>> string = re.sub(r'pattern', '', string, flags=re.I)

Of course, the same magic that makes perl great for those one-off kinds of scripts makes it less than ideal for complex or long-living scripts that need to be maintained by multiple people, in my opinion.

1 comments

No doubt: the regex syntax in Perl is the most efficient. I can envision using Perl more if I ever got back into a command-line-focused text manipulation workflow. I tend to use lighter-weight tools for things like your one-liner. I believe `sed -i '.bak' "s/foo/bar/g" *` is equivalent (that's the BSD/macOS form; GNU sed wants `-i.bak` with no space), but I disagree with the neckbeard purists who say you should always use the lightest-weight tool possible. If you're using Perl anyway and Perl can get the job done with the smallest cognitive load, that's the correct tool.
Well, lots of languages use Perl-compatible regexps now; even MariaDB uses PCRE nowadays.
You're right, I didn't say what I really meant there. It's not the regex syntax itself but the syntax surrounding the use of regexes that is more efficient. Compare the number of keystrokes in his examples above (e.g. $string =~ s/search pattern/replacement string/g;) with what you'd have to do in nearly any other language. In Python, for example, remember that you'd need to precompile the regex just to get down to about 8x slower.
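For anyone who hasn't done it, precompiling in Python just means calling `re.compile` once and reusing the resulting object. A minimal sketch follows; the pattern, `ERROR_RE`, `error_codes`, and the sample lines are all made up for illustration, and no timing claim is implied (the `re` module also caches recently compiled patterns, so the gain may be modest):

```python
import re

# Compile once at module level, then reuse the pattern object.
# (Hypothetical pattern; substitute whatever you are actually matching.)
ERROR_RE = re.compile(r"error:\s*(\w+)", re.IGNORECASE)


def error_codes(lines):
    """Return the first captured group from every line that matches."""
    codes = []
    for line in lines:
        match_obj = ERROR_RE.search(line)
        if match_obj is not None:
            codes.append(match_obj.group(1))
    return codes


sample = ["ok", "Error: disk_full", "error:   timeout"]
print(error_codes(sample))  # ['disk_full', 'timeout']
```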
Well, that was inspired by sed. I have to do a test in Ruby to see how fast regexes are, being spoiled by Perl usage and all. What I don't like is this:

  my $a = 'Perl';
  my $b = $a;
  $b =~ s/pe/ea/i;
Why not do something to $a and obtain $b, just like JS does with Array.map? Why the need to copy $a into $b and then replace? Much more elegant in Ruby:

  a = 'Ruby'
  b = a.gsub(/ru/i, 'mo')
Oh, for regex performance Perl blows Swift, Java, Python, Ruby, Go, and C++ out of the water. For simple string manipulation (substring replacement, string reversal, etc.) it's never the worst but never the best. Scroll down to the bottom of this paper, where they test actual regexes, though, and the difference is pretty stunning.

http://jultika.oulu.fi/files/nbnfioulu-202001201035.pdf

No need to copy and then replace:

  my $a =  'Perl';
  my $b =~ s/pe/ea/ir;
Doesn't work. You don't assign anything to $b, so it's undef:

  % perl -MData::Dumper=Dumper -e 'my $a = 'Perl'; my $b =~ s/pe/ea/ir; print Dumper $b;'
  $VAR1 = undef;
You probably mean:

  my $a = 'Perl';
  my $b = $a =~ s/pe/ea/ir;
Anyway, thank you for the /r modifier. I didn't know what it did, since there's no example in perlre(1).
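
For comparison, Python strings are immutable, so `re.sub` always behaves like Perl's /r: it returns a new string and leaves the original alone. A minimal sketch, where `count=1` is my assumption to mirror the missing /g in the Perl above:

```python
import re

a = 'Perl'
# count=1 mirrors Perl's s/// without /g (replace the first match only).
b = re.sub(r'pe', 'ea', a, count=1, flags=re.I)

print(a)  # Perl
print(b)  # earl
```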