by nickdrozd 2926 days ago
The awk book is a lot of fun. I just finished reading it after seeing it recommended here a few weeks ago. The highlights for me:

1. A simple interpreter for an awk-like language called qawk. qawk is like awk except that it allows querying by field name rather than field number. For instance, you can write

  { print $country, $population, $capital }

instead of the more cryptic

  { print $1, $3, $5 }

2. A pair of awk programs that together make a simple profiler. The first takes another awk program (in their example, a sorting algorithm) and outputs a version of it modified to include profiling statements, plus an END section that writes the results of those statements to a file. The second reads the data in that file and inserts it back into the original awk program, showing approximately where the hotspots are.
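
To make the first half of the profiler idea concrete, here is a rough sketch of the instrumenting step. It is not the book's code: the function name make_profiled, the counter names _cnt and _i, and the output file prof.cnts are all made up for this example, and it naively assumes that every "{" in the program opens a block (a brace inside a string or regex would be instrumented too).

```shell
# Hypothetical sketch: print an instrumented copy of the awk program in "$1".
# A counter increment is added after the first "{" on each line, and an extra
# END block dumps every counter to prof.cnts (an assumed file name).
make_profiled() {
    awk '
        /[{]/ {                       # line opens a block: count its executions
            n++
            sub(/[{]/, "{ _cnt[" n "]++; ")
        }
        { print }                     # emit the (possibly modified) line
        END {
            # Append an END block to the generated program; "+ 0" makes
            # never-executed counters print as 0 instead of an empty line.
            printf "END { for (_i = 1; _i <= %d; _i++)\n", n
            printf "        print _cnt[_i] + 0 > \"prof.cnts\"\n}\n"
        }
    ' "$1"
}
```

Running the generated program on some input (for example `make_profiled prog.awk > prog.prof.awk; awk -f prog.prof.awk data`) should leave one count per instrumented brace in prof.cnts, in source order. Merging those counts back into a listing of the original program is the job of the second program, which is not sketched here.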

There's a lot more in the book besides these, but to me these are the coolest programs because they are the awk-iest, by which I mean that they loop over lines of input, split the fields of those lines, and then manipulate the fields. Some of the programs in the book don't do this; instead, they consist of a single large BEGIN block with typical for-loops, arrays, etc. Used in this way, awk is just yet another dynamic language.

2 comments

Google doesn't know about qawk. A more general toolkit that does the same thing and (presumably) more:

http://github.com/dkogan/vnlog

Thank you, I remember wanting to follow up on these more abstract constructions in the book. They seemed to be leading me somewhere amazing and very computer science-y. Programs that take programs as input and generate new code to do that thing I wanted with some data files — I’m sure this will be useful if I put the time in.

Am I right that qawk was included as a program in the text? Did they ever follow up with further uses?

Yeah, the code is all in the book, and it works! Here's the main body of the qawk interpreter:

  BEGIN { readrel("relfile") }
  /./ { doquery($0) }

where

- relfile is a file containing the field attributes used in various database files,

- readrel is a function that parses the relfile and stores the fields in a dictionary, and

- doquery is a function that takes a qawk query, converts it to an awk query by replacing the field names with their corresponding field numbers, and then executes the awk command.

The whole thing runs about 60 lines.
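
For a feel of the name-to-number rewriting that doquery does, here is a minimal sketch. It is not the book's implementation: qawk_translate is a hypothetical helper, and it assumes a much simpler relfile than the real one, namely a file with one field name per line in column order.

```shell
# Hypothetical helper: rewrite $name references in a query as $N.
# Usage: qawk_translate RELFILE 'QUERY'
# Assumes RELFILE lists one field name per line, in column order.
qawk_translate() {
    awk -v query="$2" '
        NF { fieldnum[$1] = ++n }     # field name -> column number
        END {
            out = ""
            # Scan the query left to right, one whole $identifier at a time,
            # so a short name never matches inside a longer one.
            while (match(query, /\$[A-Za-z_][A-Za-z_0-9]*/)) {
                name = substr(query, RSTART + 1, RLENGTH - 1)
                if (name in fieldnum)
                    repl = "$" fieldnum[name]
                else
                    repl = "$" name   # unknown name (e.g. $NF): keep as is
                out = out substr(query, 1, RSTART - 1) repl
                query = substr(query, RSTART + RLENGTH)
            }
            print out query
        }
    ' "$1"
}
```

With a relfile listing country, area, population, continent, and capital, the query `{ print $country, $population, $capital }` comes out as `{ print $1, $3, $5 }`, which can then be handed to awk along with the data file.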