by hobs 1825 days ago
There's certainly some of that, and I have experienced project managers asking me to put 5GB datasets in Spark... but there's definitely a set of problems where vertical scaling is a PITA, and MPP generally breaks the SQL guarantees anyway, costs a milli, requires rewrites, etc.

When you want to process N+1 TB/PB, it's hard to throw standard relational approaches at it imo.

SQL is strings all the way down, and testing the database itself is often a shitshow...

1 comment

I agree that it can easily be "strings all the way down", but the way folks make Spark testable is often only slightly more advanced than using views in a SQL world. Add an understanding of window functions, and some trivial assertions on expected query results go a long way.
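
A rough sketch of what I mean, not Spark, just SQLite through Python's stdlib `sqlite3`, and assuming the bundled SQLite is 3.25+ so window functions work. The query logic goes in a view, you load a tiny fixture, and you assert on the rows you expect. The `orders` table, the `running_totals` view, and the helper names are all made up for illustration.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a small fixture and the view under test."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (customer TEXT, day INTEGER, amount INTEGER);
        INSERT INTO orders VALUES
            ('a', 1, 10), ('a', 2, 20), ('a', 3, 5),
            ('b', 1, 7),  ('b', 2, 3);

        -- The logic being tested lives in a view, so tests can query it by name.
        CREATE VIEW running_totals AS
        SELECT customer,
               day,
               SUM(amount) OVER (PARTITION BY customer ORDER BY day) AS running_total
        FROM orders;
        """
    )
    return conn


def fetch_running_totals(conn):
    """Return the view's rows in a deterministic order for comparison."""
    return conn.execute(
        "SELECT customer, day, running_total "
        "FROM running_totals ORDER BY customer, day"
    ).fetchall()


conn = build_db()
rows = fetch_running_totals(conn)

# Trivial assertion on the expected query result.
assert rows == [
    ("a", 1, 10),
    ("a", 2, 30),
    ("a", 3, 35),
    ("b", 1, 7),
    ("b", 2, 10),
]
```

The same shape carries over to other engines: keep the fixture small enough to compute the expected rows by hand, and always order the result so the comparison is stable.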