by studentrob
4047 days ago
Can you explain what Tachyon does that's different from what Spark already provides? I haven't used either Spark or Tachyon. I thought the Spark solution was to just put my dataset in memory, but the Tachyon page seems to say the same thing.
Basically, Tachyon acts as a distributed, reliable, in-memory file system.
To generalise enormously, separate programs have trouble sharing data that lives in RAM. Tachyon lets you share data between (say) your Spark jobs and your Hadoop Map/Reduce jobs at RAM speed, even across machines (it understands data locality, so it will attempt to keep data close to where it is being used).
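A rough single-machine analogy, in case it helps. This is not Tachyon's API; it only uses Python's standard multiprocessing.shared_memory module, and the function names and block name are made up for illustration. The idea it shows is that one program publishes bytes into a named block of RAM and another attaches to it by name, instead of each keeping a private copy. As described above, Tachyon does the equivalent across machines, behind a file system interface.

```python
import os
from multiprocessing import shared_memory


def publish(name: str, payload: bytes) -> shared_memory.SharedMemory:
    """Create a named block of shared RAM and copy payload into it.

    The caller owns the returned handle and must close() and unlink() it.
    """
    block = shared_memory.SharedMemory(name=name, create=True, size=len(payload))
    block.buf[:len(payload)] = payload
    return block


def consume(name: str, length: int) -> bytes:
    """Attach to an existing block by name and copy out `length` bytes.

    The length is passed explicitly because the OS may round the block
    size up (for example to a whole page).
    """
    block = shared_memory.SharedMemory(name=name)
    try:
        return bytes(block.buf[:length])
    finally:
        block.close()


if __name__ == "__main__":
    # Hypothetical block name; the pid keeps it unique per run.
    name = f"demo_block_{os.getpid()}"
    data = b"rows produced by job A"
    owner = publish(name, data)
    try:
        # A second program would only need the name and the length.
        print(consume(name, len(data)))
    finally:
        owner.close()
        owner.unlink()
```

In this toy version the reader has to know the block name and length up front. A file system interface removes that bookkeeping, which is a large part of why a shared in-memory file system is convenient for jobs written in different frameworks.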
[1] http://www.cs.berkeley.edu/~haoyuan/talks/Tachyon_2014-10-16...