Hacker News
by theamk 801 days ago
You normally develop the tools and scripts on a much smaller data set. So you export 1 minute of data, examine it manually or with simple tools, write your scripts, and once they work, switch to processing months' worth of it.

A human-readable format comes in handy here.

2 comments

A pretty printer is almost as handy, with significantly fewer compromises.
I prefer SQL for tabular data larger than 3 screens high; for me it’s easier for basically any analysis than a grep/wc/count/cut/paste bash pipeline. I use sqlite for CSV pretty regularly, and if I needed columnar I’d use duckdb for parquet.
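
For anyone who hasn't tried the sqlite-for-CSV route, here is a rough sketch of what it can look like using Python's standard `sqlite3` and `csv` modules. The sample data, the `requests` table name, and the `load_csv` and `requests_per_host` helpers are all made up for illustration; a real export would be read from a file instead of an inline string.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a small CSV export.
SAMPLE_CSV = """ts,host,status,bytes
2024-01-01T00:00:00,web1,200,512
2024-01-01T00:00:01,web2,500,128
2024-01-01T00:00:02,web1,200,2048
2024-01-01T00:00:03,web1,404,64
"""


def load_csv(conn, table, text):
    """Create `table` from the CSV header row and insert the remaining rows."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)


def requests_per_host(conn):
    """Roughly what a cut | sort | uniq -c pipeline would give, plus a byte total."""
    return conn.execute(
        "SELECT host, COUNT(*) AS n, SUM(CAST(bytes AS INTEGER)) AS total "
        "FROM requests GROUP BY host ORDER BY host"
    ).fetchall()


conn = sqlite3.connect(":memory:")
load_csv(conn, "requests", SAMPLE_CSV)
result = requests_per_host(conn)
print(result)
```

The columns are created without types, so values are stored as text and the query casts `bytes` before summing. The `sqlite3` command-line shell can do the loading step itself with `.mode csv` and `.import`, which is probably closer to what most people do interactively.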