Hacker News
by pletnes 1035 days ago
Lots of science is «big data, small but important metadata». There are also «big raw data, small result data» use cases out there. (I did hyperspectral work for a while, where you record tons of sensor data to get a small, neat result; think TB -> kB.) So GB might not be the best or only metric.
1 comment

My story for Datasette and Big Data at the moment is that you can use Big Data tooling (BigQuery, Parquet, etc.), then run aggregate queries against it that produce an interesting ~10MB/~100MB/~1GB summary, which you then pipe into Datasette for people to explore.

I've used that trick myself a few times. Most people don't need to interactively query TBs of data; they need to be able to quickly filter against a useful summary of it.
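For anyone who wants to see the shape of that trick, here is a minimal sketch using only the Python standard library. It is an illustration, not my actual pipeline: the in-memory `raw` list stands in for whatever the big source is (in practice the aggregation would be a GROUP BY in BigQuery or over Parquet files), and the `daily_summary` table, its columns, and the `summary.db` filename are all made-up names.

```python
import sqlite3
from collections import defaultdict


def summarize(raw_rows):
    """Collapse (site, day, value) readings into one row per (site, day)."""
    acc = defaultdict(lambda: [0, 0.0])  # (site, day) -> [count, running total]
    for site, day, value in raw_rows:
        entry = acc[(site, day)]
        entry[0] += 1
        entry[1] += value
    # Emit (site, day, n_readings, mean_value), sorted by key for stable output.
    return [
        (site, day, n, total / n)
        for (site, day), (n, total) in sorted(acc.items())
    ]


def write_summary(db_path, summary_rows):
    """Write the small summary into a SQLite file that Datasette can open."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS daily_summary")
        conn.execute(
            "CREATE TABLE daily_summary ("
            "site TEXT, day TEXT, n_readings INTEGER, mean_value REAL, "
            "PRIMARY KEY (site, day))"
        )
        conn.executemany(
            "INSERT INTO daily_summary VALUES (?, ?, ?, ?)", summary_rows
        )
    conn.close()


# Hypothetical stand-in for the "big" raw data.
raw = [
    ("a", "2021-01-01", 1.0),
    ("a", "2021-01-01", 3.0),
    ("b", "2021-01-02", 5.0),
]
summary = summarize(raw)
write_summary("summary.db", summary)
```

Running `datasette summary.db` afterwards serves the summary table for browsing and filtering. The point is that only the small aggregated table ever reaches SQLite; the raw rows stay wherever the heavy tooling lives.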