Association control in mobile wireless networks
Minkyong Kim, Zhen Liu, et al.
INFOCOM 2008
We describe a system for model-based speech separation that achieves super-human recognition performance when two talkers speak at similar levels. The system can separate the speech of two speakers from a single-channel recording. It incorporates a novel method for performing two-talker speaker identification and gain estimation. We extend the method of model-based high-resolution signal reconstruction to incorporate temporal dynamics. We report on two methods for introducing dynamics: the first uses dynamics in the acoustic model space, and the second incorporates dynamics based on sentence grammar. The addition of temporal constraints leads to dramatic improvements in separation performance. Once the signals have been separated, they are recognized using speaker-dependent labeling.