by jnaour
4233 days ago
Good introduction. Spark is really a project to watch in the field of data analysis on distributed architectures. We ran several benchmarks and Spark keeps its promises: it was 2.5x faster than Pig for the same algorithm on the same cluster. For iterative algorithms that can take advantage of in-memory processing, performance is really good compared to Hadoop. The project is still young and has several bugs, but the documentation is really good and the code is well commented and robust.
|
In nearly every test Naiad has beaten Spark.
More info on Naiad: http://research.microsoft.com/en-us/projects/naiad/