| No! The point of Metropolis-Hastings is to sample from a distribution when you do not know the partition function. It is the most important building block in a family of algorithms broadly known as Markov Chain Monte Carlo. These algorithms are particularly useful in Bayesian statistics. Genetic algorithms will not give you samples from a distribution; they only perform optimization. Particle swarms also focus on optimization, and on top of that, they do not seem to have either theoretical justification or empirical success. MH is embarrassingly parallel, since you can run multiple chains at the same time. Again, the point isn't optimization (that would be simulated annealing) but sampling. Being 100 years old is also largely irrelevant. People publish new algorithms all the time to get publications. That does not mean they necessarily outperform the old ones. Gradient descent is the basic algorithm used to train all of these cool new deep learning models, and it's much older than MH. Yes, there are more recent improvements to MH, the two biggest being Hamiltonian Monte Carlo (which uses gradient information) and Parallel Tempering (which is somewhat similar to homotopy optimization), but that's hardly a reason to dismiss the importance of this algorithm. |
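To make the sampling claim in the comment above concrete, here is a minimal sketch, not a definitive implementation. It assumes a toy 1-D target, exp(-x^2/2), which is a standard normal with its normalizing constant deliberately left out, and a symmetric random-walk proposal (the plain Metropolis special case of MH). The names `unnormalized_density` and `metropolis_hastings`, the step size, and the chain lengths are all illustrative choices.

```python
import math
import random

def unnormalized_density(x):
    # Assumed toy target: a standard normal WITHOUT its 1/sqrt(2*pi) factor,
    # standing in for a density whose partition function is unknown.
    return math.exp(-0.5 * x * x)

def metropolis_hastings(density, n_samples, step=1.0, x0=0.0, seed=0):
    rng = random.Random(seed)  # each chain gets its own independent generator
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)  # symmetric random-walk proposal
        # Only a ratio of densities is needed, so the unknown constant cancels.
        if rng.random() < density(proposal) / density(x):
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current state
    return samples

# Chains share no state, so each one could run on a separate core;
# here they simply run one after another.
chains = [metropolis_hastings(unnormalized_density, 20000, seed=s) for s in range(4)]
pooled = [x for chain in chains for x in chain]
mean = sum(pooled) / len(pooled)
var = sum((x - mean) ** 2 for x in pooled) / len(pooled)
print(mean, var)
```

If the sketch behaves as intended, the pooled mean and variance should land near 0 and 1, the moments of the normalized target, even though the code never uses the normalizing constant.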
"The point of Metropolis-Hastings is to sample from a distribution when you do not know the partition function." That's one point, yes. The other is optimization. In which case I prefer the others I mentioned.
"they do not seem to have either theoretical justification or empirical success." That's just patently false. They _do_ have empirical success. And besides, a lot of these variants have MH implemented inside of them to some extent o.o
"MH is embarrassingly parallel since you run multiple chains at the same time." I can make six single grilled cheese sandwiches in 2 minutes, but it takes me 8 minutes to make a 6-decker grilled cheese. Is that a parallel process?
"Being 100 years old is also largely irrelevant." In my opinion it is relevant since computation can now be done in completely new ways than 100 years ago.