Lattice-based Viterbi decoding techniques for speech translation
George Saon, Michael Picheny
ASRU 2007
In a large vocabulary speech recognition system using hidden Markov models, calculating the likelihood of an acoustic signal segment for all the words in the vocabulary involves a large amount of computation. To run in real time on a modest amount of hardware, it is important that these detailed acoustic likelihood computations be performed only on words that have a reasonable probability of being the word that was spoken. We describe a scheme for rapidly obtaining an approximate acoustic match for all the words in the vocabulary in such a way as to ensure that the correct word is, with high probability, one of a small number of words examined in detail. Using fast search methods, we obtain a matching algorithm that is about a hundred times faster than doing a detailed acoustic likelihood computation on all the words in the IBM Office Correspondence isolated word dictation task, which has a vocabulary of 20,000 words. We give experimental results showing the effectiveness of such a fast match for a number of talkers. © 1993 IEEE
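The two-pass idea in the abstract (a cheap approximate match over the whole vocabulary, then a detailed match on a short candidate list) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names `shortlist` and `two_pass_match`, the toy string "signal", and both scoring functions are hypothetical stand-ins chosen only to show the control flow.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

Score = Callable[[str], float]  # higher score means a better match


def shortlist(vocab: Iterable[str], approx_score: Score, k: int) -> List[str]:
    """Pass 1: score every word cheaply and keep the k best candidates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return heapq.nlargest(k, vocab, key=approx_score)


def two_pass_match(
    vocab: Iterable[str], approx_score: Score, detailed_score: Score, k: int
) -> Tuple[str, int]:
    """Pass 2: run the expensive score only on the shortlisted words.

    Returns the best word and the number of detailed evaluations performed.
    """
    candidates = shortlist(vocab, approx_score, k)
    best = max(candidates, key=detailed_score)
    return best, len(candidates)


# Toy stand-ins (assumptions): a string plays the role of the acoustic signal.
observed = "speech"
vocab = ["speech", "speed", "spinach", "beach", "translation", "peach"]


def approx_score(word: str) -> float:
    # Cheap proxy: compare lengths only.
    return -abs(len(word) - len(observed))


def detailed_score(word: str) -> float:
    # Costlier proxy: count positions where the characters agree.
    return sum(a == b for a, b in zip(word, observed))


best, n_detailed = two_pass_match(vocab, approx_score, detailed_score, k=3)
print(best, n_detailed)
```

In this sketch the detailed score is evaluated for only 3 of the 6 toy words. The approach works only if the cheap score rarely excludes the correct word from the shortlist, which is the property the abstract says the scheme is designed to ensure.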