by disgruntledphd2 1989 days ago
Yeah, I suppose. I kinda think that distributed SQL is a mostly commoditised space, and wondered what replaced Spark for distributed training.

For context, I'm a DS who's spent far too much time not being able to run useful models because of hardware limitations, and a Spark cluster is incredibly good at getting around those.

Additionally, I'd argue in favour of Spark even for ETL, as the ability to write (and test!) complicated SQL queries in R, Python and Scala was super, super transformative.

We don't really use Spark at my current place, and every time I write Snowflake SQL (Snowflake is great, to be fair), I'm reminded of the inherent limitations of SQL and how wonderful Spark SQL was.

I'm weird though, to be fair.