Current Status of the IBM Trainable Speech Synthesis System
R. Donovan, A. Ittycheriah, et al.
SSW 2001
Previous work on the distribution of words in documents has shown that word repetitiveness is an important indicator of a word's content-bearing characteristics. In this paper we propose a simple method that uses a measure of the tendency of words to repeat within a document to separate words with similar document frequencies but different topic-discriminating characteristics. We describe the application of the new measure to query-document relevance scoring. Experiments on the TREC Ad Hoc and Spoken Document Retrieval tasks show useful performance improvements.
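The abstract does not define the repetition measure itself. As a purely illustrative sketch, and not the authors' method, one common proxy for within-document repetition is the mean number of occurrences of a word in the documents that contain it (collection frequency divided by document frequency). The function name `repetition_stats` and the toy documents are hypothetical.

```python
from collections import Counter


def repetition_stats(docs):
    """Map each word to (document_frequency, mean_count_per_containing_doc).

    Assumption: the ratio cf/df stands in for "tendency to repeat";
    the measure used in the paper is not given in the abstract.
    """
    df = Counter()  # number of documents that contain the word
    cf = Counter()  # total occurrences across the whole collection
    for tokens in docs:
        counts = Counter(tokens)
        cf.update(counts)         # add per-document counts
        df.update(counts.keys())  # each word counted once per document
    return {word: (df[word], cf[word] / df[word]) for word in df}


# Toy collection: "synthesis" and "however" share a document frequency of 2,
# but only "synthesis" repeats inside the documents where it appears.
docs = [
    "synthesis synthesis synthesis however".split(),
    "synthesis synthesis however retrieval".split(),
    "retrieval speech".split(),
]
stats = repetition_stats(docs)
print(stats["synthesis"], stats["however"])
```

In this toy collection the two words are indistinguishable by document frequency alone, while the repetition ratio separates them, which is the kind of distinction the abstract describes.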
S. Dharanipragada, Martin Franz, et al.
ICSLP 2000
Martin Franz, Salim Roukos
SIGIR Forum (ACM Special Interest Group on Information Retrieval)