by cs101 2327 days ago
> productivity skyrocketed

Was the increased productivity due to Awk or was it due to Perl?


Really, any of the scripting languages.

My recollection is that there was a lot of disdain for scripting languages because they could not match the speed of C or other low-level languages. Today, computers are so many orders of magnitude faster that it’s hard to believe the speed of scripting languages was ever an issue.

John Ousterhout’s paper on programmer productivity gains comes to mind:

https://web.stanford.edu/~ouster/cgi-bin/papers/scripting.pd...

Just a few days ago I wrote a simple awk script to parse some log files, but it was horrendously slow. I had to replace understandable loops with weird calls to builtin functions to make it fast enough for my use case.
You're doing something wrong. I've used awk to run big data reformatting jobs in under an hour that took most of a day to run in Scala on an Apache Spark cluster. In the vast majority of cases today, if speed is your problem, then you are the problem - especially since most problems fit into RAM these days, even w/o exotic stuff like RAMcloud...
Try mawk; I've had it run 4x faster than GNU awk on some things.
This! It was not unusual to have 100 folks sharing a 20 MHz workstation. Trying to run an interpreted language was a pain! Heck, even compiling a few hundred lines of C code would take seconds.
In benchmarks I ran, Perl was 10x faster than awk. When Yahoo benchmarked scripting languages, mod_perl won but they chose PHP anyway.

Also Perl has better support for programming-in-the-large than awk, with modules, lexical scoping and CPAN.

Booking.com, IMDB and I believe the Amazon frontend are all written in Perl.

Which version of awk did you benchmark though? Mawk is pretty fast, for bread-and-butter awk stuff often significantly faster than perl. E.g.

    perl -anE 'say($F[0]) if /error/' big.log >/dev/null  0.85s user 0.02s system 99% cpu 0.867 total


    mawk '/error/ {print $1}' big.log > /dev/null  0.21s user 0.03s system 99% cpu 0.246 total
Not the OP but mawk wasn't really a thing in the 90s. Although the project was started in the mid 90s it only really saw a year of development before it languished unmaintained until 2009.
Not so. Mawk was released in 1991. Perl (as in Perl5) in 94.

https://github.com/mikebrennan000/mawk-2/blob/master/about-m...

1991 to 1996 (when it was abandoned) isn't all that long for a programming language, particularly when there are already mature and widely deployed implementations of awk out there. Also bear in mind that in the 90s people weren't as disciplined about keeping their OS updated, so mawk might never have made it onto people's systems unless they built it themselves. That is a hard sell if you've already got awk installed on the host, given that the point of running awk is a short-term productivity gain (i.e. if you were going to the trouble of compiling mawk, then you might as well write your script in something lower level to begin with).

Plus if you're going to talk about older builds of mawk then you can't really ignore older versions of Perl as well (which was originally released in 1987). Otherwise you're not making a fair comparison.

I should add, I have absolutely nothing against mawk. It just wasn't something available on any of the POSIX systems I used in the 90s. tbh even now it's an optional install but at least it is an easier install than it was in the 90s.

Awk has functions with named parameters.

Local variables unfortunately can only be obtained in the form of extra parameters (which are not passed by the caller).

  function foo(x, y,   # params
               z, w)   # locals
  { 
  }
There is a convention to separate the two by some obvious whitespace.

There are no block-scoped locals: all locals have to go into the parameter list. Initial values cannot be specified.

(Speaking of which, there are hacks for simulating the feature of optional parameters with defaulted values.)
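
A minimal sketch of one such hack, assuming only that a parameter the caller omits starts out uninitialized and so compares equal to the empty string (standard awk behaviour). The function name greet and the "Hello" default are made up for illustration. The catch is that the caller cannot pass an explicit empty string, because that looks the same as leaving the argument out.

```shell
# Hypothetical example: greet() and its "Hello" default are illustrative only.
prog='
function greet(name, greeting)
{
  # An omitted argument is uninitialized, so it compares equal to "".
  if (greeting == "")
    greeting = "Hello"
  return greeting ", " name
}

BEGIN {
  print greet("world")        # second argument omitted: default applies
  print greet("world", "Hi")  # second argument supplied: default skipped
}
'
awk "$prog"
```

The first call falls back to the default and the second uses the supplied value.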

In GNU Awk, from the following experiment, the scope appears lexical:

  function bar()
  { 
    return x
  }

  function foo(x)
  {
    x = 3
    return bar();
  }

  BEGIN { x = 42; print foo(); }
The output is 42, which means that the "x = 3" assignment to the local variable x in foo does not affect the access to the free variable x in bar, as it would under dynamic scope.