Hacker News
by 1996 2425 days ago
Why do you want to do that?

Use cat, pipe, grep, awk. Problem solved.


I personally am not a huge fan of awk. I've never built a great mental model of its syntax, and it doesn't really solve the problem I was talking about, which is getting a spreadsheet-like editing experience. Thanks for bringing it up, though; I should definitely add it to the article.
For the quoted spreadsheet-like operations of filtering and rearranging, awk is perfect as a deferred editor. That the kludgy first step of your chosen solution ("First, you must create a CSV file contain only the first 10-20 lines of your large CSV file") isn't just `head very_large_nov_2019.csv > very_large_nov_2019_abridged.csv` seems to further indicate an unfamiliarity with the large set of built-in, battle-tested UNIX tools for dealing with text files.
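To make that concrete, here is a rough sketch of the abridging step plus a filter-and-rearrange pass in awk. It assumes a plain comma-separated file with no quoted fields or embedded commas; the file `sample.csv` and its `id,name,amount` columns are made up for illustration and are not from the article.

```shell
# Hypothetical sample data in a scratch directory (not the article's file).
# Assumes no quoted fields, embedded commas, or embedded line breaks.
tmp=$(mktemp -d)
printf 'id,name,amount\n1,alice,30\n2,bob,5\n3,carol,12\n' > "$tmp/sample.csv"

# Abridge: keep the header plus the first two data rows.
head -n 3 "$tmp/sample.csv" > "$tmp/sample_abridged.csv"

# Filter and rearrange: keep the header and rows where amount > 10,
# then print the columns in the order name,id.
filtered=$(awk -F, -v OFS=, 'NR == 1 || $3 > 10 { print $2, $1 }' "$tmp/sample.csv")
printf '%s\n' "$filtered"
```

Nothing is modified in place: the original file stays untouched and each step writes a new, smaller result that can be inspected with less.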

The first tools I reach for when dealing with CSVs of these and larger magnitudes are less, cut, awk, etc. They also tend to be the last tools I end up needing.

How well do those tools work with arbitrary CSV files, e.g. ones containing line breaks or quotes in field data? I wasn't aware that they can actually parse CSV; as far as I know, you instead have to assume things about the content that may not turn out to be true.
Every data processing task has to make assumptions about the well-formedness of its input. "Arbitrary CSV" is basically undefined; whether deviations are best dealt with by parsing, preprocessing, or different tools altogether depends on the source.
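A small sketch of where the line-oriented assumption breaks, using a made-up file `quoted.csv` whose quoted fields contain a comma and a line break. The file and its contents are hypothetical, chosen only to trigger the problem.

```shell
# Hypothetical file: a header plus two records. Record 1 has a comma inside
# a quoted field; record 2 has a line break inside a quoted field.
tmp=$(mktemp -d)
printf 'id,note,city\n1,"hello, world",Oslo\n2,"line one\nline two",Rome\n' > "$tmp/quoted.csv"

# cut splits on every comma, so "field 3" of record 1 is the tail of the
# quoted note rather than the city.
third=$(sed -n '2p' "$tmp/quoted.csv" | cut -d, -f3)
printf '[%s]\n' "$third"

# wc counts physical lines, not CSV records: the file has 3 records
# (including the header) spread over 4 lines.
lines=$(wc -l < "$tmp/quoted.csv" | tr -d ' ')
printf '%s\n' "$lines"
```

If the source is known never to emit such fields, the simple tools are fine. If it might, that is the point where preprocessing or a CSV-aware parser becomes the better choice.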