Hacker News
by niggler 4913 days ago
Where's the love for awk? It has been tucked away in a sub-item; doesn't it deserve first-class status?
3 comments

That whole "find -ls | awk" is wicked slow anyway; try wc and xargs...

  $ time find -ls | awk '{s += $7} END {print s}'
  15970582120

  real	0m27.721s
  user	0m1.256s
  sys	0m1.780s

  $ time find | xargs wc -c 2> /dev/null | tail -1
  604260969 total

  real	0m0.332s
  user	0m0.068s
  sys	0m0.204s
The standard disclaimer on find | xargs: you should use -print0 and -0 to avoid problems with file names that contain whitespace, e.g.

   $ find -print0 | xargs -0 wc -c 2> /dev/null | tail -1
(Also, many uses of find | xargs can be replaced with -exec cmd {} \; or -exec cmd {} +, e.g.

  $ find -exec wc -c {} + 2> /dev/null | tail -1
although this isn't much faster in this case.)
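To make the whitespace problem concrete, here is a minimal sketch. The temporary directory and the file names are made up for the demo, and it assumes an xargs that supports -0 (GNU and BSD both do):

```shell
# Scratch directory with one awkward file name (names are hypothetical).
dir=$(mktemp -d)
printf 'hello' > "$dir/with space.txt"   # 5 bytes, name contains a space
printf 'abc'   > "$dir/plain.txt"        # 3 bytes

# Newline-delimited: xargs splits "with space.txt" into two bogus arguments,
# wc cannot open either of them, and 2> /dev/null hides the errors.
unsafe=$(find "$dir" -type f | xargs wc -c 2> /dev/null | tail -1 | awk '{print $1}')

# NUL-delimited: every path reaches wc intact.
safe=$(find "$dir" -type f -print0 | xargs -0 wc -c | tail -1 | awk '{print $1}')

echo "unsafe=$unsafe safe=$safe"
rm -rf "$dir"
```

The unsafe pipeline silently misses the file with the space in its name, so it reports 3 bytes where the safe one reports 8.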
You sure that's not because of memory swapping? Once warmed up the awk command is much faster for me.

Also, the results are different - though I'm too lazy to figure out why right now :)

You need to filter out directory entries.

    find -type f -ls | awk '{s += $7} END {print s}'
    find -type f -print0 | xargs -0 wc -c | tail -1
    find -type f -exec wc -c {} + | tail -1
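One more possible source of differing totals, beyond directory entries: on a big tree, xargs and -exec ... + may split the file list across several wc invocations, each printing its own "total" line, so tail -1 only sees the last batch. A rough sketch of a way around that, assuming file names without newlines and a starting point that is a directory (sum_sizes is a made-up helper name, not a standard tool):

```shell
# Sum the per-file lines from wc and skip every "N total" line, so the result
# does not depend on how many times wc was invoked.
sum_sizes() {
  find "$1" -type f -exec wc -c {} + |
    awk '!($2 == "total" && NF == 2) {s += $1} END {print s + 0}'
}

# Scratch tree for the demo (names are hypothetical).
dir=$(mktemp -d)
mkdir "$dir/sub"
printf '1234567890' > "$dir/a.txt"       # 10 bytes
printf '12345'      > "$dir/sub/b.txt"   # 5 bytes

total=$(sum_sizes "$dir")

# -n 1 forces one wc run per file to imitate batching: tail -1 then shows
# only the last file's size, not the sum.
last=$(find "$dir" -type f -print0 | xargs -0 -n 1 wc -c | tail -1 | awk '{print $1}')

echo "sum=$total last-batch-only=$last"
rm -rf "$dir"
```

With GNU find, something like find -type f -printf '%s\n' piped into awk would avoid wc altogether, but -printf is not portable.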
I am not familiar with awk; what does 's += $7' mean? What is the argument 2 passed to wc? Why is the produced output different? What am I missing here?
s += $7 means "add the content of the 7th column to the total"
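As for the 2: it is not an argument to wc. 2> /dev/null is a shell redirection that throws away standard error, so complaints like "Is a directory" don't clutter the output. And here is a small sketch of the awk part on made-up input shaped like find -ls output, where field 7 is the size in bytes (the sample lines, owner and file names are invented):

```shell
# Two fake lines in the layout of `find -ls` (GNU find): inode, blocks,
# permissions, links, owner, group, size in bytes, date, name.
sample='131 4 -rw-r--r-- 1 alice users 100 Jan 1 2013 ./a.txt
132 8 -rw-r--r-- 1 alice users 250 Jan 1 2013 ./b.txt'

# s starts out unset (treated as 0); every input line adds its 7th field to s;
# the END block runs once after the last line and prints the sum.
total=$(printf '%s\n' "$sample" | awk '{s += $7} END {print s}')
echo "$total"
```

That prints 350, i.e. 100 + 250.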
Makes sense, thanks!
2013 will be the year of awk on the desktop.
Actually, there is heaps of love for awk... so much that I'd rather spend a whole post on it than "just" make it one item :)
Motivated by your list, I went through my history to pull some awk snowclones http://news.ycombinator.com/item?id=4989524