The IBM speech activity detection system for the DARPA RATS program

George Saon; Samuel Thomas; Hagen Soltau; Sriram Ganapathy; Brian Kingsbury

INTERSPEECH 2013

Conference paper

25 Aug 2013

Abstract

We present the IBM speech activity detection system that was fielded in the phase 2 evaluation of the DARPA RATS (robust automatic transcription of speech) program. Key ingredients of the system are: multi-pass HMM Viterbi segmentation, fusion of multiple feature streams, file-based and speech-based normalization schemes, the use of regular and convolutional deep neural networks, and model fusion through frame-level score combination of channel-dependent models. These techniques were instrumental in achieving a 1.4% equal error rate on the RATS phase 2 evaluation data. Copyright © 2013 ISCA.
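Two of the ingredients named in the abstract, frame-level score combination and Viterbi segmentation, can be illustrated with a toy sketch. This is not the IBM system: the function names (`fuse_scores`, `viterbi_segment`), the equal fusion weights, the two-state topology, and the `switch_penalty` value are all assumptions chosen for illustration, and the per-frame scores are made-up log-likelihood ratios (positive favors speech).

```python
def fuse_scores(model_scores, weights=None):
    """Combine per-frame scores from several models by a weighted sum.

    model_scores: list of equal-length lists, one list of frame scores per model.
    weights: optional per-model weights; defaults to a uniform average (assumption).
    """
    n_models = len(model_scores)
    if weights is None:
        weights = [1.0 / n_models] * n_models
    n_frames = len(model_scores[0])
    return [
        sum(w * scores[t] for w, scores in zip(weights, model_scores))
        for t in range(n_frames)
    ]


def viterbi_segment(llr, switch_penalty=2.0):
    """Two-state Viterbi decoding (0 = non-speech, 1 = speech).

    llr[t] > 0 favors speech at frame t. Each state change costs
    switch_penalty, which discourages very short segments.
    """
    if not llr:
        return []
    # Emission scores: state 0 gets -llr/2, state 1 gets +llr/2.
    score = [-llr[0] / 2.0, llr[0] / 2.0]
    backpointers = []
    for x in llr[1:]:
        emit = (-x / 2.0, x / 2.0)
        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s]
            switch = score[1 - s] - switch_penalty
            if stay >= switch:
                ptr.append(s)
                new_score.append(stay + emit[s])
            else:
                ptr.append(1 - s)
                new_score.append(switch + emit[s])
        backpointers.append(ptr)
        score = new_score
    # Trace back from the better final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical frame scores from two channel-dependent models.
model_a = [-3.0, -3.0, 4.0, 4.0, -1.0, 4.0, 4.0, -3.0, -3.0]
model_b = [-1.0, -1.0, 2.0, 2.0, 0.0, 2.0, 2.0, -1.0, -1.0]

fused = fuse_scores([model_a, model_b])
labels = viterbi_segment(fused, switch_penalty=2.0)
```

In this toy input, the fifth frame has a slightly negative fused score, but the switching penalty makes it cheaper to stay in the speech state than to leave and re-enter it, so the decoded speech segment is not split. A frame-by-frame threshold on the same scores would split it.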