I like the idea of Unix pipelines, but I hate all the sublanguages, awk being one of the biggest. I scratched my itch and built my own shell, marcel: https://github.com/geophile/marcel.
I mention this here because of the CSV point. Marcel handles CSV: for example, "read --csv foobar.csv" reads the foobar.csv file, parses the input (getting quotes and commas correct), and yields a stream of Python tuples, one per line of the CSV, with each field becoming one element of the tuple.
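To make the "stream of tuples" idea concrete, here is a rough sketch in plain Python using the standard csv module. This is not marcel's implementation, just an illustration of the behavior described; the read_csv_tuples name and the sample data are made up.

```python
import csv
import io


def read_csv_tuples(lines):
    """Yield one tuple per CSV record (hypothetical helper, not marcel code).

    csv.reader takes care of quoted fields, embedded commas, and doubled quotes.
    """
    for row in csv.reader(lines):
        yield tuple(row)


# Made-up sample standing in for foobar.csv.
sample = io.StringIO('name,motto\n"Smith, J.","say ""hi"""\n')
rows = list(read_csv_tuples(sample))
for row in rows:
    print(row)
```

Each record comes out as a tuple of strings, with the quoting already resolved.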
Marcel also supports JSON input, translating JSON structures into Python equivalents. (The "What's New" section of marcel's README has more information on JSON support, which was just added.)
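As a rough illustration of what "translating JSON structures into Python equivalents" means, here is a sketch using the standard json module. It assumes one JSON document per input line, which is my assumption for the example and not necessarily how marcel reads its input; read_json_values is a made-up name.

```python
import json


def read_json_values(lines):
    """Yield the Python equivalent of each JSON document (hypothetical helper).

    Assumes one JSON document per line; blank lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


values = list(read_json_values(['{"a": [1, 2, null], "ok": true}', '"text"']))
print(values)
```

Objects become dicts, arrays become lists, and null, true, and false become None, True, and False.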
I usually use this function to parse CSV in awk:
# This function takes a line, i.e. $0, treats it as a line of CSV, breaks
# it into individual fields, and stores them in the passed-in field array. It
# returns the number of fields found, or 0 if none are found. It handles CSV
# quoting, including commas within quoted fields, but does not strip the
# quotes from the parsed field.
# use in code like:
# number_of_fields = parse_csv_line($0, csv_fields)
# csv_fields[2] # get second parsed field in $0
function parse_csv_line(line, field, _field_count) {
  _field_count = 0
  # Treat each line as a CSV line and break it up into individual fields
  while (match(line, /(\"([^\"]|\"\")+\")|([^,\"\n]+)/)) {
    field[++_field_count] = substr(line, RSTART, RLENGTH)
    line = substr(line, RSTART+RLENGTH+1, length(line))
  }
  return _field_count
}
It's not perfect (empty fields, for example, are skipped rather than returned as empty strings), but it gets the job done most of the time and works across all awk implementations.
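To see where a regex-based splitter like this does and doesn't hold up, here is a rough Python port of the same loop. It is only a sketch for experimenting: the pattern is copied from the awk function, and the slicing mirrors the substr call that steps past the separator.

```python
import re

# Same pattern as the awk function: a quoted field (with "" as an escaped
# quote) or a run of characters that are not commas, quotes, or newlines.
FIELD = re.compile(r'("([^"]|"")+")|([^,"\n]+)')


def parse_csv_line(line):
    """Return the fields found in one CSV line; quotes are left in place."""
    fields = []
    while True:
        m = FIELD.search(line)
        if m is None:
            break
        fields.append(m.group(0))
        # Skip the match plus the one character after it (the comma).
        line = line[m.end() + 1:]
    return fields


print(parse_csv_line('a,"b,c",d'))
print(parse_csv_line('a,,b'))
```

The first call keeps the comma inside the quoted field together; the second shows the empty middle field being dropped, so "b" ends up as the second field instead of the third.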