Bowen Zhou, Bing Xiang, et al.
SSST 2008
This paper describes several language modeling issues in a speech-to-speech translation system. First, the language models for the speech recognizer need to be adapted to the specific domain to improve recognition performance on in-domain utterances while keeping domain coverage as broad as possible. Second, when a maximum-entropy-based statistical natural language generation model is used to produce the target-language sentence as the translation output, serious inflection and synonym errors arise, because a compromise semantic representation is used to avoid the data sparseness problem. We use N-gram models as a post-processing step to improve generation quality. When an interpolated language model is applied to a Chinese-to-English translation task, translation performance, measured by the BLEU metric, improves substantially from 0.318 to 0.514 when the correct transcription is used as input. Similarly, the BLEU score improves from 0.194 to 0.300 on the same task when the input is speech data.
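A minimal sketch of the kind of N-gram post-processing and language-model interpolation the abstract refers to: an interpolated bigram model (in-domain plus general, mixed with weight lam) rescores candidate surface realizations that differ only in inflection or word choice. The probability tables, candidate sentences, and the weight value below are hypothetical placeholders, not taken from the paper.

```python
import math

def interp_logprob(bigram, p_in, p_gen, lam=0.7, floor=1e-6):
    """Log of the linearly interpolated probability of one bigram."""
    p = lam * p_in.get(bigram, 0.0) + (1.0 - lam) * p_gen.get(bigram, 0.0)
    return math.log(max(p, floor))

def score(sentence, p_in, p_gen):
    """Sum interpolated bigram log-probabilities over a tokenized sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(interp_logprob((a, b), p_in, p_gen)
               for a, b in zip(tokens, tokens[1:]))

# Toy bigram tables (hypothetical) and two candidate realizations that
# differ only in verb inflection, as an NLG step might produce.
p_in = {("<s>", "he"): 0.2, ("he", "wants"): 0.15, ("he", "want"): 0.01,
        ("wants", "tickets"): 0.1, ("want", "tickets"): 0.1,
        ("tickets", "</s>"): 0.3}
p_gen = {("<s>", "he"): 0.1, ("he", "wants"): 0.05, ("he", "want"): 0.02,
         ("wants", "tickets"): 0.02, ("want", "tickets"): 0.02,
         ("tickets", "</s>"): 0.2}

candidates = ["he wants tickets", "he want tickets"]
best = max(candidates, key=lambda s: score(s, p_in, p_gen))
print(best)  # the interpolated LM prefers the grammatical inflection
```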
Haiping Li, Fangxin Chen, et al.
ICASSP 2003
Ruhi Sarikaya, Yuqing Gao, et al.
ICASSP 2004
Fu-Hua Liu, Yuqing Gao
ISCSLP 2004