Hacker News
by p3ll0n 5821 days ago
A major problem for speaker-independent automatic speech recognition systems is the variability of the speech signal: the same sequence of words uttered by different speakers, or even uttered several times by one speaker, never results in identical speech signals.

ROS (rate of speech) is one of the primary contributors to this variability, and some recent research (http://www.ee.columbia.edu/~dpwe/papers/PfauR98-spkrate.pdf) has shown that good estimates of speaking rate can be obtained using vowel detection, since vowels generally correspond to syllable nuclei.

I wonder if Sukhotin's algorithm could be modified to improve upon this work?
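
For context, here is a minimal sketch of Sukhotin's algorithm as it is commonly described for written text: count how often each pair of distinct letters appears side by side, then repeatedly promote the letter with the largest remaining adjacency sum to "vowel" and discount its neighbours. The function name `sukhotin_vowels` and the word-list input are my own assumptions for illustration. It works on letter sequences, not on an acoustic signal.

```python
from collections import defaultdict


def sukhotin_vowels(words):
    """Return the set of symbols that Sukhotin's algorithm labels as vowels.

    `words` is an iterable of strings (hypothetical input format).
    """
    # Symmetric adjacency counts between distinct neighbouring letters.
    adj = defaultdict(lambda: defaultdict(int))
    for word in words:
        for a, b in zip(word, word[1:]):
            if a != b:  # the diagonal of the matrix stays at zero
                adj[a][b] += 1
                adj[b][a] += 1

    # Row sums; every letter starts out classified as a consonant.
    sums = {ch: sum(row.values()) for ch, row in adj.items()}
    vowels = set()

    while True:
        consonants = {ch: s for ch, s in sums.items() if ch not in vowels}
        if not consonants:
            break
        best = max(consonants, key=consonants.get)
        if consonants[best] <= 0:
            break  # no remaining consonant has a positive sum
        vowels.add(best)
        # Discount each remaining consonant by twice its adjacency
        # count with the newly found vowel.
        for ch in sums:
            if ch not in vowels:
                sums[ch] -= 2 * adj[ch][best]

    return vowels
```

The only input is symbol adjacency, so applying it to speech would first require turning the signal into a sequence of discrete symbols.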

1 comment

Probably only in the same sense that Sukhotin's algorithm could be modified to implement the rules to Super Mario Brothers.