by alextp
4594 days ago
ML PhD student here. The reason this is different is that the parallel Monte Carlo simulations run on different subsets of the data on each machine, and the results are then averaged. It is not obvious that this can work at all in some cases. Consider, for example, a clustering model. If there are two clusters, but one machine calls them A B and the other machine calls them B A, averaging the per-cluster parameters blends the two clusters together and gives you useless results. So the contribution of this paper is finding a set of models for which naive averaging works, and showing an efficient MapReduce implementation of it. That said, I don't find the paper particularly interesting.

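To make the clustering example concrete, here is a toy sketch in Python. The numbers, the function names, and the sort-based alignment are all made up for illustration and are not taken from the paper; sorting only works as an alignment trick in this one-dimensional toy case.

```python
# Hypothetical per-machine estimates of two 1-D cluster means.
# Both machines found the same clusters (near 0 and near 10),
# but machine 2 happened to label them in the opposite order.
machine_1 = [0.1, 9.9]    # [cluster "A", cluster "B"]
machine_2 = [10.2, -0.2]  # same clusters, labels swapped


def naive_average(estimates):
    """Average the parameter vectors position by position."""
    n = len(estimates)
    return [sum(column) / n for column in zip(*estimates)]


def aligned_average(estimates):
    """Sort each machine's means first so matching clusters line up.

    Assumption: this alignment is only meaningful for 1-D means.
    """
    return naive_average([sorted(e) for e in estimates])


naive = naive_average([machine_1, machine_2])
aligned = aligned_average([machine_1, machine_2])

print("naive:  ", naive)    # both values land near 5, between the clusters
print("aligned:", aligned)  # values land near 0 and near 10
```

The naive average puts both "clusters" near 5, where there is no data at all, while the aligned version recovers means near 0 and 10. Models where the parameters have no such labeling ambiguity avoid the problem entirely, which is why the choice of model class matters.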
Academics tend to trivialize the implementation (we had some pretty strong critics of his talk in my dept), but some kudos are in order for that, even if the algorithm itself isn't revolutionary.
[1]: http://www.stat.columbia.edu/~gelman/research/unpublished/co...