by mastratton3
3682 days ago
So I did find it useful for additional exploratory aggregations once the data was already cleaned and denormalized. My comment was directed more at the upfront data processing (in our case, extracting time series data out of a large number of files). I did hit issues with multiple joins and shuffling, though. Have you not hit issues with shuffling? For the record, I was using Spark 1.5.1.