Bhuvana Ramabhadran, Jing Huang, et al.
INTERSPEECH - Eurospeech 2003
This paper applies the recently proposed Extended Maximum Likelihood Linear Transformation (EMLLT) model in a Speaker Adaptive Training (SAT) context on the Switchboard database. Adaptation is carried out by maximum likelihood estimation of linear transforms for the means, the precisions (inverse covariances), and the feature space under the EMLLT model. This paper presents the first experimental evidence that significant word-error-rate improvements can be achieved with the EMLLT model (in both VTL and VTL+SAT training contexts) over a state-of-the-art diagonal covariance model on a difficult large-vocabulary conversational speech recognition task. The improvements were on the order of 1% absolute in multiple scenarios.
Sabine Deligne, Ellen Eide, et al.
INTERSPEECH - Eurospeech 2001
Jennifer C. Lai, Kwan Min Lee
ICSLP 2002
Youssef Mroueh, Etienne Marcheret, et al.
AISTATS 2017