Orly Stettiner, Dan Chazan
ICPR 1994
This paper presents a novel approach to concatenative speech synthesis. The approach reduces the dataset size of a concatenative text-to-speech system, namely the IBM trainable speech synthesis system, by more than an order of magnitude. A speech representation based on spectral acoustic features is used both for computing the cost function during segment selection and for speech generation. Initial results indicate that, even with a dataset of only a few megabytes, it is possible to achieve quality significantly higher than that of existing small-footprint formant-based synthesizers.
Jennifer C. Lai, Kwan Min Lee
ICSLP 2002
Zvi Kons, Hagai Aronowitz
INTERSPEECH 2013
Raul Fernandez, Asaf Rendel, et al.
ICASSP 2013