by tma-1
3679 days ago
I have been using the DataFrame/SQL API extensively and I just love it. Most of the issues I have had stemmed from the cluster / Spark configuration and not the API itself. Using SQL is so much more intuitive than chaining multiple joins, selects, filters, etc. on an RDD.

I did hit issues with multiple joins and shuffling though. Have you not hit issues with shuffling?
I was using Spark 1.5.1 for the record.