Building Text Corpus for Unit Selection Synthesis
Volume 25, Issue 4 (2014), pp. 551–562
Pub. online: 1 January 2014
Type: Article
Received
1 February 2012
Accepted
1 October 2014
Published
1 January 2014
Abstract
This paper addresses building a text corpus for unit selection text-to-speech synthesis. During synthesis, target and concatenation costs are calculated, usually from the prosodic and acoustic features of sounds. If the cost calculation is moved to the phonological level, unit selection synthesis can be simulated without any real recordings; text transcriptions are sufficient. We propose using the cost calculated during simulated synthesis of test data to evaluate the quality of the text corpus. A greedy algorithm that maximizes coverage of certain phonetic units is used to build the corpus. Corpora optimized to cover phonetic units of different size and weight are evaluated.
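As an illustration of the greedy coverage idea mentioned in the abstract, the following minimal sketch selects sentences one at a time, each time taking the sentence that adds the largest weighted amount of not-yet-covered units. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: the function names `units_of` and `greedy_select`, the representation of a sentence as a list of phoneme symbols, the use of phoneme n-grams as units (n = 2 approximates diphones), the default weight of 1.0, and the tie-breaking rule.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Unit = Tuple[str, ...]  # a phonetic unit as an n-gram of phoneme symbols (assumed)


def units_of(phones: Sequence[str], n: int = 2) -> Set[Unit]:
    """Return the set of phoneme n-grams in one transcribed sentence."""
    return {tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)}


def greedy_select(
    sentences: List[Sequence[str]],
    n: int = 2,
    weights: Optional[Dict[Unit, float]] = None,
    max_sentences: Optional[int] = None,
) -> Tuple[List[int], Set[Unit]]:
    """Greedily pick sentence indices that maximize weighted unit coverage.

    Units missing from `weights` count as 1.0 (an assumption of this sketch).
    Ties are broken in favour of the lower sentence index.
    """
    weights = weights or {}
    candidates = {idx: units_of(p, n) for idx, p in enumerate(sentences)}
    covered: Set[Unit] = set()
    chosen: List[int] = []

    def gain(idx: int) -> float:
        # Total weight of the units this sentence would newly cover.
        return sum(weights.get(u, 1.0) for u in candidates[idx] - covered)

    while candidates and (max_sentences is None or len(chosen) < max_sentences):
        best = max(candidates, key=lambda i: (gain(i), -i))
        if gain(best) <= 0:
            break  # no remaining sentence adds new coverage
        chosen.append(best)
        covered |= candidates.pop(best)
    return chosen, covered


if __name__ == "__main__":
    toy = [["a", "b", "c"], ["a", "b"], ["c", "d", "e", "f"]]
    picked, seen = greedy_select(toy, n=2)
    print(picked, len(seen))
```

Changing `n` or supplying `weights` (for example, weights derived from unit frequency) gives corpora optimized for units of different size and weight, which could then be compared by whatever evaluation cost is chosen.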