|
I admit I have never really learned awk beyond the dead simple stuff, but one of the most useful Python utilities I have ever written is below. It lets you apply a Python lambda to each line of stdin: every whitespace-separated column is converted with the type you name and bound to x0, x1, and so on. Example usage:
    lambda "x0 + x1 * x2" int int int
Code:
    import sys
    # libs I use commonly (so they can be used inside the expression)...
    import random, itertools, re

    # parse the types of the columns (eval turns "int" into int, and so on)
    types = [eval(t) for t in sys.argv[2:]]
    # craft a lambda with the args x0, x1, ... xN
    f = eval("lambda " + ','.join("x" + str(i) for i in range(len(types))) + ":" + sys.argv[1])
    # apply lambda to stdin, don't print results of None
    for line in sys.stdin:
        # convert each whitespace-separated column with its declared type
        args = [t(e) for t, e in zip(types, line.strip().split())]
        result = f(*args)
        if result is not None:
            print(result)
Examples:
Where data.txt is
    john doe 37
    jane doe 35
    jack bob 20
    bill bob 40
And I do
    cat data.txt | lambda "x0 if x2 < 36 else x1" str str int
It will output
    doe
    jane
    jack
    bob
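For anyone who wants to check that example without a shell, here is a minimal sketch of the same per-line evaluation wrapped in plain functions. It assumes Python 3, and the names `make_lambda` and `apply_to_lines` are hypothetical helpers made up for illustration, not part of the script above. Unlike the script, it takes the column types as Python objects rather than as strings from the command line.

```python
import itertools  # imported so expressions can use them, as in the script
import random
import re


def make_lambda(expr, n_cols):
    """Build a function of x0..x{n_cols-1} from an expression string.

    This uses eval, like the script, so only pass expressions you trust.
    """
    params = ",".join("x%d" % i for i in range(n_cols))
    return eval("lambda " + params + ":" + expr)


def apply_to_lines(expr, types, lines):
    """Convert each line's columns with `types`, apply expr, skip None results."""
    f = make_lambda(expr, len(types))
    results = []
    for line in lines:
        # zip pairs each declared type with the column in the same position
        args = [t(e) for t, e in zip(types, line.split())]
        result = f(*args)
        if result is not None:
            results.append(result)
    return results


data = ["john doe 37", "jane doe 35", "jack bob 20", "bill bob 40"]
out = apply_to_lines("x0 if x2 < 36 else x1", [str, str, int], data)
print(out)  # ['doe', 'jane', 'jack', 'bob']
```

Only the first row has an age of 36 or more with "doe" as its second column, and only the last has "bob" there, which is why the output mixes first and last names.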
You can use this sort of tool for a million things, e.g. to sample roughly 1 out of every 1000 lines (x0 here is the first whitespace-separated column of each line):
    lambda "x0 if random.random() <= .001 else None" str
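One thing worth spelling out about the sampling expression: it keeps each line independently with probability of about 0.001, so you get a random sample whose size varies from run to run, not exactly every 1000th line. A small sketch of that behavior, assuming Python 3 and using a hypothetical helper name `sample_lines` that is not part of the script:

```python
import random


def sample_lines(lines, rate=0.001, rng=random):
    """Keep each line independently with probability of about `rate`."""
    return [line for line in lines if rng.random() <= rate]


rng = random.Random(0)  # seeded only so that a rerun gives the same sample
lines = ["line %d" % i for i in range(100000)]
sampled = sample_lines(lines, rate=0.001, rng=rng)
print(len(sampled))  # around 100 on average, but it varies with the seed
```

If you need exactly one line per thousand, a counter-based rule would be the thing to use instead of random.random().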
It probably has about the same power as awk, but I am much more familiar with Python, so it's useful for me. |