Hacker News
by rttlesnke 4416 days ago
I've been studying HMMs lately. I think getting an initial estimate of HMM emission and transition parameters using Segmental K-means training (or Viterbi training) before applying Baum-Welch re-estimation should result in the latter converging better. That's what the HInit tool in HTK does. AFAIK, it's done in the following way:

- Divide each example (observation sequence) uniformly into as many segments as there are states, and assign the i-th segment to the i-th state.

- For each state, cluster the observations assigned to it and estimate that state's GMM from the clusters, so that each cluster corresponds to one multivariate Gaussian component.

- Repeat until convergence: compute the Viterbi alignment of all examples, use it to get new segments, and re-estimate the parameters as in the previous step.
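
To make the three steps above concrete, here is a minimal sketch in Python with NumPy. It is not HTK's HInit code, just my reading of the procedure under some simplifying assumptions: one diagonal-covariance Gaussian per state instead of a GMM (so the clustering step reduces to a per-state mean and variance), a uniform initial state distribution, and an unconstrained transition matrix with a small smoothing count. All function and parameter names are made up for illustration.

```python
import numpy as np

def uniform_segments(T, n_states):
    """Step 1: split T frames into n_states contiguous, equal-length segments."""
    return np.arange(T) * n_states // T

def estimate(seqs, aligns, n_states, var_floor=1e-3):
    """Step 2 (simplified): one diagonal Gaussian per state from hard alignments."""
    X, z = np.concatenate(seqs), np.concatenate(aligns)
    means = np.zeros((n_states, X.shape[1]))
    variances = np.ones((n_states, X.shape[1]))
    for s in range(n_states):
        pts = X[z == s]
        if len(pts) > 0:  # an empty state keeps zero mean / unit variance here
            means[s] = pts.mean(axis=0)
            variances[s] = np.maximum(pts.var(axis=0), var_floor)
    # Transition probabilities from counted state-to-state moves (smoothed).
    counts = np.full((n_states, n_states), 1e-3)
    for a in aligns:
        np.add.at(counts, (a[:-1], a[1:]), 1.0)
    trans = counts / counts.sum(axis=1, keepdims=True)
    return means, variances, trans

def log_gauss(X, means, variances):
    """Log density of every frame under every state's Gaussian, shape (T, S)."""
    diff = X[:, None, :] - means[None, :, :]
    terms = np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None]
    return -0.5 * terms.sum(axis=2)

def viterbi(X, means, variances, trans):
    """Most likely state path for one sequence, plus its log score."""
    logB, logA = log_gauss(X, means, variances), np.log(trans)
    T, S = logB.shape
    delta = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    delta[0] = logB[0] - np.log(S)  # uniform initial state distribution
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA  # rows: from-state, cols: to-state
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = back[t + 1, path[t + 1]]
    return path, delta[-1].max()

def viterbi_train(seqs, n_states, max_iter=20, tol=1e-6):
    """Step 3: alternate re-estimation and Viterbi realignment until the
    total best-path log score stops improving."""
    aligns = [uniform_segments(len(x), n_states) for x in seqs]
    prev = -np.inf
    for _ in range(max_iter):
        means, variances, trans = estimate(seqs, aligns, n_states)
        results = [viterbi(x, means, variances, trans) for x in seqs]
        aligns = [path for path, _ in results]
        total = sum(score for _, score in results)
        if total - prev < tol:
            break
        prev = total
    return means, variances, trans, aligns
```

Each sequence is expected as a `(T, dim)` array with at least as many frames as states. The parameters returned by `viterbi_train` could then serve as the starting point for Baum-Welch; extending `estimate` to fit several Gaussians per state would bring it closer to the GMM version described above.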

Please correct me if I'm wrong. Also, I have two questions:

- What kind of accuracy increase should be expected if using both Viterbi training and Baum-Welch re-estimation, instead of just the latter?

- What kind of accuracy should be expected if only using Viterbi training?

1 comments

In my experience Viterbi training alone is often enough to get reasonable accuracy, at least in speech recognition. Instead of doing the more costly Baum-Welch training, you can spend your time better elsewhere, e.g. on using deep neural networks instead of GMMs or on collecting more data.

Interesting - I've never tried Viterbi training. Maybe it is worth implementing after all. I plan to do a hybrid DNN-HMM (or whatever it is called now) with pylearn2 in a follow-up post.