by wmwragg 1087 days ago

I usually use this awk function to parse CSV in awk:

    # This function takes a line i.e. $0, and treats it as a line of CSV, breaking
    # it into individual fields, and storing them in the passed in field array. It
    # returns the number of fields found, 0 if none found. It takes account of CSV
    # quoting, and also commas within CSV quoted fields, but doesn't remove the
    # quotes from the parsed field.
    # use in code like:
    #   number_of_fields = parse_csv_line($0, csv_fields)
    #   csv_fields[2]  # get second parsed field in $0
    function parse_csv_line(line, field,   _field_count) {
      _field_count = 0
      # Treat each line as a CSV line and break it up into individual fields
      while (match(line, /(\"([^\"]|\"\")+\")|([^,\"\n]+)/)) {
        field[++_field_count] = substr(line, RSTART, RLENGTH)
        line = substr(line, RSTART+RLENGTH+1, length(line))
      }
      return _field_count
    }

It's not perfect, but it gets the job done most of the time and works across all awk implementations.
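
One place where it falls short is empty fields: for a line like `a,,b` the regex never matches the empty field, so the function returns 2 fields instead of 3. Below is a minimal sketch of a variant that keeps empty fields, strips the surrounding quotes, and turns `""` back into `"`. The names `parse_csv_line_strict` and `csv_fields` are made up for this example and are not part of the function above; the sketch assumes one record per line (no newlines inside quoted fields) and sticks to POSIX awk features.

```sh
# Hypothetical helper: prints each parsed field of every stdin line as index:[value].
csv_fields() {
  awk '
    # Sketch of a variant of parse_csv_line (parse_csv_line_strict is a made-up
    # name). It consumes one field at a time from the front of the line, so
    # empty fields are kept. Quoted fields lose their surrounding quotes and
    # "" becomes ". Parsing stops early at malformed input such as "a"b.
    function parse_csv_line_strict(line, field,   n, rest, value, len, comma) {
      n = 0
      if (line == "") return 0          # same convention as the original: 0 if none
      rest = line
      while (1) {
        if (match(rest, /^"([^"]|"")*"/)) {
          # Quoted field: drop the outer quotes, then unescape doubled quotes.
          len = RLENGTH
          value = substr(rest, 2, len - 2)
          gsub(/""/, "\"", value)
        } else {
          # Unquoted (possibly empty) field: everything up to the next comma.
          comma = index(rest, ",")
          len = (comma > 0) ? comma - 1 : length(rest)
          value = substr(rest, 1, len)
        }
        field[++n] = value
        rest = substr(rest, len + 1)
        if (substr(rest, 1, 1) != ",") break   # end of line or malformed input
        rest = substr(rest, 2)                  # skip the separating comma
      }
      return n
    }
    {
      n = parse_csv_line_strict($0, f)
      for (i = 1; i <= n; i++) printf "%d:[%s]\n", i, f[i]
    }
  '
}

printf '%s\n' 'a,,"b,c","say ""hi"""' | csv_fields
# 1:[a]
# 2:[]
# 3:[b,c]
# 4:[say "hi"]
```

Like the original, it does not clear the field array between calls, so rely on the returned count rather than on the array's length.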