by stellographer
4028 days ago
I see you feel strongly about this.
"The point of Metropolis-Hastings is to sample from a distribution when you do not know the partition function."
That's one point, yes. The other is optimization, in which case I prefer the others I mentioned.
"they do not seem to have either theoretical justification or empirical success."
That's just patently false. They _do_ have empirical success. And besides, a lot of these variants have MH implemented inside of them to some extent o.o
"MH is embarrassingly parallel since you run multiple chains at the same time."
I can make six single grilled cheese sandwiches in 2 minutes, but it takes me 8 minutes to make a 6-decker grilled cheese. Is that a parallel process?
"Being 100 years old is also largely irrelevant."
In my opinion it is relevant, since computation can now be done in ways completely different from those of 100 years ago.
|
2) When you are sampling a distribution, you're not trying to make a 6-decker grilled cheese, you're trying to make many grilled cheese sandwiches.
3) In completely new ways? Not really. Algorithmic complexity dominates run time, independent of the computing medium. An algorithm designed to be run efficiently by a group of human "computers" with calculators is probably very similar to the same algorithm designed to be run by a CPU. If anything, the CPU-optimized algorithm is likely to benefit from more sequential processing and less parallelism.
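To make point 2 concrete, here is a minimal sketch of several independent Metropolis-Hastings chains. It assumes a standard normal target known only up to its normalizing constant and a Gaussian random-walk proposal. Both are illustrative choices rather than anything from this thread, and the names `log_unnormalized_density`, `run_chain` and `run_chains` are hypothetical.

```python
import math
import random


def log_unnormalized_density(x):
    # Assumed target: exp(-x^2 / 2). The partition function sqrt(2*pi)
    # is never computed or used anywhere below.
    return -0.5 * x * x


def run_chain(seed, n_steps=5000, step_size=1.0):
    # One random-walk Metropolis-Hastings chain with its own private RNG.
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, p(proposal) / p(x)); the unknown constant cancels.
        log_ratio = log_unnormalized_density(proposal) - log_unnormalized_density(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        samples.append(x)
    return samples


def run_chains(n_chains=4, n_steps=5000):
    # The chains share no state, so each call could run on its own core
    # or machine; a plain loop is used here to keep the sketch short.
    return [run_chain(seed, n_steps) for seed in range(n_chains)]


if __name__ == "__main__":
    pooled = [x for chain in run_chains() for x in chain]
    mean = sum(pooled) / len(pooled)
    var = sum((x - mean) ** 2 for x in pooled) / len(pooled)
    print(f"pooled mean {mean:.3f}, pooled variance {var:.3f}")
```

Each chain is one sandwich: nothing in `run_chain` depends on any other chain, so the loop in `run_chains` could be handed to a process pool without touching the sampler itself.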