Hacker News
by orangepurple 1390 days ago
I know this isn't the correct definition, but I think of "big data" as any data set that takes me more than 15 minutes, on average, to query with a moderately complex Postgres join on well-indexed data. I use JSONB in Postgres regularly and have indexes on that too. So far I have gotten really far by raising Postgres work_mem to a gigabyte or more, using a fast SSD, and adding strategically placed materialized views. The same kinds of operations in Pandas make my computer billow smoke by comparison.
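For anyone who wants to try the same knobs, here is a rough sketch of what that setup could look like. It is not the exact configuration described above: the tables (`orders`, `customers`), the JSONB column (`payload`), and the view name (`customer_totals`) are all made up for illustration, and the helper takes any DB-API 2.0 cursor rather than assuming a particular driver.

```python
# Hypothetical schema, for illustration only:
#   customers(id), orders(id, customer_id, payload JSONB)
TUNING_STATEMENTS = [
    # Raise the memory available to each sort/hash operation, for this session only.
    "SET work_mem = '1GB'",
    # GIN index so JSONB containment queries (payload @> '{...}') can use an index.
    "CREATE INDEX IF NOT EXISTS orders_payload_gin ON orders USING GIN (payload)",
    # Precompute an expensive join and aggregate once, then read it many times.
    """CREATE MATERIALIZED VIEW IF NOT EXISTS customer_totals AS
       SELECT c.id AS customer_id, count(*) AS order_count
       FROM customers c
       JOIN orders o ON o.customer_id = c.id
       GROUP BY c.id""",
    # Recompute the view after the underlying tables have changed.
    "REFRESH MATERIALIZED VIEW customer_totals",
]


def apply_tuning(cursor):
    """Run each statement, in order, on a DB-API 2.0 cursor."""
    for statement in TUNING_STATEMENTS:
        cursor.execute(statement)
```

Note that `work_mem` applies per sort or hash operation, not per connection, so a complex query running several such steps at once can use a multiple of that value. Setting it per session, as here, is a more cautious choice than raising it server-wide.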