Online speaker diarization using adapted i-vector transforms
Weizhong Zhu, Jason Pelecanos
ICASSP 2016
The performance of a typical speaker verification system degrades significantly in reverberant environments. This degradation is partly due to the conventional feature extraction/compensation techniques that use analysis windows which are much shorter than typical room impulse responses. In this paper, we present a feature extraction technique which estimates long-term envelopes of speech in narrow sub-bands using frequency domain linear prediction (FDLP). When speech is corrupted by reverberation, the long-term sub-band envelopes are convolved in time with those of the room impulse response function. In a first order approximation, gain normalization of these envelopes in the FDLP model suppresses the room reverberation artifacts. Experiments are performed on the 8 core conditions of the NIST 2008 speaker recognition evaluation (SRE). In these experiments, the FDLP features provide significant improvements on the interview microphone conditions (relative improvements of 20-30%) over the corresponding baseline system with MFCC features. © 2011 IEEE.
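The envelope estimation and gain normalization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dct_ii` and `fdlp_envelope`, the model order, and the small regularization term are assumptions made for illustration. The sketch also works on the full band, whereas the abstract describes narrow sub-bands, which would additionally require windowing the DCT coefficients. It applies autocorrelation-method linear prediction to the DCT of the signal, so that the resulting all-pole model approximates the temporal envelope, and optionally sets the model gain to one.

```python
import numpy as np

def dct_ii(x):
    """Unnormalized type-II DCT via an explicit cosine matrix (fine for short signals)."""
    n = len(x)
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    return np.cos(np.pi * (t + 0.5) * k / n) @ x

def fdlp_envelope(x, order=20, n_points=None, normalize_gain=True):
    """Approximate the temporal envelope of x with an all-pole model fitted
    to its DCT coefficients (hypothetical helper, full-band only)."""
    x = np.asarray(x, dtype=float)
    n_points = n_points or len(x)
    c = dct_ii(x)

    # Autocorrelation of the DCT sequence at lags 0..order.
    r = np.array([c[:len(c) - k] @ c[k:] for k in range(order + 1)])

    # Toeplitz normal equations R a = -r[1:], with a small diagonal load
    # (an assumption of this sketch) to keep the solve well conditioned.
    lags = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[lags] + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, -r[1:])
    a_full = np.concatenate(([1.0], a))

    # Prediction-error energy is the model gain; gain normalization replaces it with 1.
    gain = 1.0 if normalize_gain else r[0] + a @ r[1:]

    # Evaluating the all-pole model on [0, pi) yields the envelope over time.
    A = np.fft.rfft(a_full, n=2 * n_points)[:n_points]
    return gain / np.abs(A) ** 2
```

With `normalize_gain=True`, rescaling the input leaves the envelope unchanged. In this simplified setting, that invariance is the property the abstract relies on to suppress, to a first-order approximation, the per-band gain introduced by reverberation.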