Hacker News
by simlevesque 147 days ago
But when indexing your JSON or CSV in a columnar format, if you have, say, 10 columns, each column is stored separately on disk instead of all together. So a scan of one column only needs to read about a tenth of the disk space used for the data. Obviously the exact fraction depends on the columns' content.
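To make the one-tenth figure concrete, here's a rough sketch (not any real engine's storage format). The table size, the column picked, and the helper names `scan_row_store` / `scan_col_store` are all made up for illustration, and every value is a fixed-width 64-bit integer so that all columns take the same space.

```python
import struct

NUM_COLS = 10    # hypothetical table width
NUM_ROWS = 1000  # hypothetical row count

# Toy table: every value is a 64-bit integer, so all columns are the same size.
rows = [[r * NUM_COLS + c for c in range(NUM_COLS)] for r in range(NUM_ROWS)]

# Row-oriented layout: all fields of a row sit next to each other (like CSV lines).
row_store = b"".join(struct.pack(f"<{NUM_COLS}q", *row) for row in rows)

# Column-oriented layout: each column is its own contiguous byte string.
col_store = [
    struct.pack(f"<{NUM_ROWS}q", *(row[c] for row in rows))
    for c in range(NUM_COLS)
]

def scan_row_store(col):
    """Sum one column by reading every row in full; returns (sum, bytes read)."""
    total = 0
    for values in struct.iter_unpack(f"<{NUM_COLS}q", row_store):
        total += values[col]
    return total, len(row_store)

def scan_col_store(col):
    """Sum one column by reading only that column's bytes; returns (sum, bytes read)."""
    data = col_store[col]
    total = sum(v for (v,) in struct.iter_unpack("<q", data))
    return total, len(data)

row_total, row_bytes = scan_row_store(3)
col_total, col_bytes = scan_col_store(3)
print(row_total == col_total, row_bytes, col_bytes)
```

Both scans give the same sum, but the columnar one touches a tenth of the bytes. With variable-width columns (long strings next to small ints) the fraction won't be exactly one tenth, which is the "depends on the columns' content" part.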
1 comments

But you can have a surprisingly large amount of data before the inefficiency you're talking about becomes untenable.