| HN Mirror



	by feqgmmr2 1681 days ago
	Spark has always been able to handle way larger scale than any DW.

2 comments

hello_moto 1675 days ago

Handle what though?

Can Spark query 100Bn rows of structured data while aggregating on multiple fields (or dimensions)?


geoduck14 1681 days ago

In my previous company, we had 63 petabytes of data in Snowflake.


hello_moto 1675 days ago

That sounds great: the storage problem is solved.

What about large-scale reads via OLAP queries (y'know, the typical measures and dimensions)?


blueglassfish 1681 days ago

That's a respectable amount for a DW, true. Spark and its ilk are designed for much larger scales though. Multiple FAANG use cases for Spark are in the petabytes-per-week range.
