by feqgmmr2 1681 days ago
Spark has always been able to handle way larger scale than any DW.
2 comments

Handle what though?

Can Spark query 100Bn structured records while aggregating on multiple fields (or dimensions)?
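To make the question concrete, the kind of query meant here is a grouped aggregation: pick some fields as dimensions, then sum or count a measure per combination. A toy sketch in plain Python follows. This is not Spark code, and the field names (`region`, `product`, `revenue`) and the `aggregate` helper are made up for illustration; the open question is whether the same shape of query holds up at 100Bn rows.

```python
from collections import defaultdict

# Hypothetical fact rows: two dimensions (region, product) and one measure (revenue).
rows = [
    {"region": "EU", "product": "A", "revenue": 10.0},
    {"region": "EU", "product": "A", "revenue": 5.0},
    {"region": "EU", "product": "B", "revenue": 7.0},
    {"region": "US", "product": "A", "revenue": 3.0},
]

def aggregate(rows, dimensions, measure):
    """Sum `measure` and count rows for each distinct combination of `dimensions`."""
    totals = defaultdict(lambda: {"sum": 0.0, "count": 0})
    for row in rows:
        # The group key is the tuple of dimension values, e.g. ("EU", "A").
        key = tuple(row[d] for d in dimensions)
        totals[key]["sum"] += row[measure]
        totals[key]["count"] += 1
    return dict(totals)

result = aggregate(rows, ["region", "product"], "revenue")
```

A distributed engine would presumably do the same thing per partition and then merge the partial sums and counts, which is why both the number of rows and the number of distinct dimension combinations matter for this kind of read.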

In my previous company, we had 63 petabytes of data in Snowflake.
That sounds great: the storage problem is solved.

What about large-scale reads via OLAP queries (y'know, the typical measures and dimensions)?

That's a respectable amount for a DW, true. Spark and its ilk are designed for much larger scales though. Multiple FAANG use cases for Spark are in the petabytes-per-week range.