	by hello_moto 1681 days ago
	It'll take a few more years until these companies fix all the bugs and address all the scalability issues. As of today, these companies are not good enough to take on the Data Warehouse part.

feqgmmr2 1681 days ago

Spark has always been able to handle way larger scale than any DW.

hello_moto 1675 days ago

Handle what though?

Can Spark query 100Bn structured records, performing aggregations on multiple fields (or dimensions)?

geoduck14 1681 days ago

In my previous company, we had 63 petabytes of data in Snowflake.

hello_moto 1675 days ago

That sounds great: storage problem is solved.

What about large-scale reads via OLAP queries (y'know, the typical measures and dimensions)?

blueglassfish 1681 days ago

That's a respectable amount for a DW, true. Spark and its ilk are designed for much larger scales though. Multiple FAANG use cases for Spark are in the petabytes per week range.
