New developments in voice biometrics for user authentication
Hagai Aronowitz, Ron Hoory, et al.
INTERSPEECH 2011
We are interested in comparing training methods for designing better decoders. We treat the training problem as a statistical parameter estimation problem. In particular, we consider the conditional maximum likelihood estimate (CMLE)—the value of unknown parameters which maximizes the conditional probability of words given acoustics during training. We compare it to the maximum likelihood estimate (MLE)—the estimate obtained by maximizing the joint probability of the words and acoustics. For minimizing the decoding error rate of the (“optimal”) maximum a posteriori probability (MAP) decoder, we show that the CMLE (or maximum mutual information estimate, MMIE) may be preferable when the model is incorrect and, in this sense, the CMLE/MMIE appears more robust than the MLE. © 1988 IEEE
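The two estimates contrasted in the abstract can be written out explicitly. The notation below is an assumption introduced for illustration and is not taken from the paper: \theta denotes the unknown model parameters, (w_i, a_i) for i = 1, ..., n are the training pairs of words and acoustics, and p_\theta is the assumed model family.

\[
\hat{\theta}_{\mathrm{MLE}} = \arg\max_{\theta} \sum_{i=1}^{n} \log p_{\theta}(w_i, a_i),
\qquad
\hat{\theta}_{\mathrm{CMLE}} = \arg\max_{\theta} \sum_{i=1}^{n} \log p_{\theta}(w_i \mid a_i),
\]
\[
p_{\theta}(w \mid a) = \frac{p_{\theta}(w, a)}{\sum_{w'} p_{\theta}(w', a)},
\qquad
\hat{w}(a) = \arg\max_{w} \, p_{\hat{\theta}}(w \mid a).
\]

The last expression is the MAP decoder evaluated at whichever estimate is plugged in. Under the additional assumption that the prior over words p(w) does not depend on \theta, the mutual information criterion \sum_i \log [ p_\theta(w_i, a_i) / (p(w_i) \, p_\theta(a_i)) ] differs from the CMLE objective only by a constant, which is why the two are named together.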
Arthur Nádas
IEEE Transactions on Acoustics, Speech, and Signal Processing
Wael Hamza, Raimo Bakis, et al.
ICSLP 2004