by iwebdevfromhome
2179 days ago
What are your thoughts on AWS Glue/Spark? We're starting to have problems with data frames that no longer fit into memory on 32 GB clusters, and upgrading to the next option, a 64 GB cluster, is expensive. We plan to migrate to Glue as a long-term solution, but I think we need a short-term fix while the migration takes place. Thanks for the article; before it I only knew of Dask as a real alternative. P.S. I just remembered that I wanted to try Pandarallel as well. Do you have any insight on this library? Thanks!
To be fair, that's one of the reasons that Spark ML stuff works quite well. Be warned, though: estimating how long a Spark job will take or how many resources it will need is a dark, dark art.