Speech synthesis for a specific speaker based on a labeled speech database

R. Hoory; D. Chazan

doi:10.1109/ICPR.1994.577142

ICPR 1994

Conference paper

09 Oct 1994

Abstract

This paper proposes a new text-to-speech synthesis technique for producing continuous, natural-sounding speech of a specific speaker. The synthesis technique is based on selecting short speech frames from a phoneme-labeled speech database. The selection procedure minimizes a distortion criterion using a dynamic programming algorithm. The proposed scheme is more flexible than many existing schemes that use fixed speech segments, such as diphones, and it results in more natural synthesized speech. An efficient speech representation is used to express the spectral continuity of speech simply and accurately. A further improvement in the database search mechanism and in database size was obtained by sectioning the speech phonemes into "steady-states" and "transitions". The resulting synthesized speech quality is satisfactory and indeed preserves the natural voice of the speaker.
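The frame selection described in the abstract can be pictured as a shortest-path search over candidate database frames. The abstract does not give the paper's actual distortion criterion or speech representation, so the following is only a minimal sketch of a generic Viterbi-style search. The function name `select_frames`, the squared-Euclidean continuity cost, the per-candidate `target_cost`, and the `concat_weight` parameter are all assumptions made for illustration.

```python
def select_frames(candidates, target_cost, concat_weight=1.0):
    """Pick one candidate frame per position, minimizing total distortion.

    candidates[t]  : list of feature vectors (lists of floats) for the
                     database frames eligible at position t (hypothetical).
    target_cost[t] : list of per-candidate costs at position t, e.g. a
                     mismatch with the desired phoneme label (assumed).
    concat_weight  : weight of the spectral-continuity term (assumed).
    Returns (path, total_cost), where path[t] indexes candidates[t].
    """
    def sqdist(a, b):
        # Squared Euclidean distance, standing in for spectral discontinuity.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # cost[i] = cheapest distortion of any path ending at candidate i.
    cost = [float(c) for c in target_cost[0]]
    back = []  # back[t-1][j] = best predecessor of candidate j at position t

    for t in range(1, len(candidates)):
        new_cost, pointers = [], []
        for j, cur in enumerate(candidates[t]):
            totals = [cost[i] + concat_weight * sqdist(prev, cur)
                      for i, prev in enumerate(candidates[t - 1])]
            best_i = min(range(len(totals)), key=totals.__getitem__)
            new_cost.append(totals[best_i] + float(target_cost[t][j]))
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path backwards from the best final candidate.
    last = min(range(len(cost)), key=cost.__getitem__)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, cost[last]


# Toy example: three positions, two one-dimensional candidates each.
toy_candidates = [[[0.0], [10.0]], [[8.0], [1.0]], [[1.0], [10.0]]]
toy_targets = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
path, total = select_frames(toy_candidates, toy_targets)
print(path, total)
```

Because the search runs over individual frames rather than fixed units, the chosen sequence may switch between database locations wherever the continuity cost allows it. That is one way to read the flexibility the abstract claims over fixed segments such as diphones.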
