Murat Saraclar, Abhinav Sethy, et al.
ASRU 2013
In this paper, we explore the use of lattices to generate pronunciations for speech recognition from a small number (say one or two) of speech utterances of a word. Various search strategies are investigated in combination with schemes that generate single or multiple pronunciations per speech utterance. In our experiments, a strategy that combines merging time-overlapping links in a context-dependent subphone lattice with generating multiple pronunciations yields the best recognition accuracy, giving average relative gains of 30% over generating single pronunciations with a Viterbi search.
Hagen Soltau, Lidia Mangu, et al.
ASRU 2011
Sabine Deligne, Ellen Eide, et al.
INTERSPEECH - Eurospeech 2001
Mohamed Kamal Omar, Lidia Mangu
ICASSP 2007