by threeseed
4049 days ago
Spark is fast becoming the default tool for big data. The recent addition of SparkR in 1.4 means that data scientists can now leverage in-memory data in the cluster that was put there as output from jobs written by either Scala or DW developers. Combine it with Tachyon (http://tachyon-project.org) and it's not hard to imagine petabytes of data all processed in memory.

I haven't used either Spark or Tachyon. I thought the Spark solution was to just put my dataset in memory, but the Tachyon page seems to say the same thing.