|
|
|
|
|
by dyoo1979
4372 days ago
|
|
You'll probably want to read the FlumeJava paper. http://pages.cs.wisc.edu/~akella/CS838/F12/838-CloudPapers/F... Citation: http://dl.acm.org/citation.cfm?id=1806638 The key word is pipeline. If you have some analysis that runs in several stages, you'll be taking the output of one stage, and connecting it to the next. If you want to compose multiple phases, chained together, raw MapReduce isn't going to help you very much with the chaining. What's described in the paper is a way to do the chaining in a nice way. The system will take care of writing the raw MapReduces for you. But it'll also do a lot of work on the interconnections between your stages as well. |
|