Hacker News
by
gberger
2069 days ago
What do you recommend for distributed data processing?
2 comments
MrPowers
2069 days ago
Dask is a great alternative for distributed computing as well:
https://github.com/dask/dask
IMO, Spark is better for some tasks and Dask is better for others.
peteradio
2069 days ago
The first step is to decide whether you really need distributed data processing. I think this is the point the author is making. I've seen GB-sized data considered "BIG DATA", and it's unbelievable what architectural patterns get used to support this "BIG DATA".
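As an illustration of that point (a minimal sketch, not from the original comment): a GB-sized CSV can often be aggregated on a single machine by streaming it row by row, so memory use stays roughly constant. The function name `sum_by_key` and the column names in the usage note are hypothetical.

```python
import csv
from collections import defaultdict


def sum_by_key(path, key_col, value_col):
    """Stream a CSV file row by row and total value_col for each key_col.

    Only the running totals are held in memory, not the whole file,
    so the file can be much larger than available RAM.
    """
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row[key_col]] += float(row[value_col])
    return dict(totals)
```

For example, `sum_by_key("events.csv", "user", "bytes")` would return a dict of per-user totals, assuming a file with those two (hypothetical) columns. If a job like this fits in one process, a cluster may not be needed at all.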