Journal: Informatica
Volume 27, Issue 3 (2016), pp. 573–586
Abstract
Phoneme duration modelling is one of the stages of prosody modelling in text-to-speech systems. The rule-based phoneme duration model proposed by Klatt (1979) remains a popular method. One of its main shortcomings is that the parameter values are selected experimentally. This work proposes a new iterative algorithm that automatically estimates the factors of the Klatt model from an annotated corpus of a speaker's audio recordings. Phoneme duration models were built for three different Lithuanian speakers. The quality of the estimated phoneme durations was evaluated using the root mean square error, the mean absolute error, and the correlation coefficient.
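As a rough illustration of the quantities named in the abstract, the following minimal Python sketch shows the commonly cited form of the Klatt duration rule and the three evaluation measures. The abstract itself does not state the rule, so the formula DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100 is taken here as an assumption about the model's general shape. All function and parameter names are hypothetical, and the sketch does not reproduce the iterative estimation algorithm proposed in the paper.

```python
import math


def klatt_duration(inherent_ms, minimum_ms, percent):
    """Assumed Klatt-style rule: DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100.

    `percent` stands for the combined effect of the rule factors (hypothetical name).
    """
    return minimum_ms + (inherent_ms - minimum_ms) * percent / 100.0


def rmse(predicted, observed):
    """Root mean square error between predicted and observed durations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, observed):
    """Mean absolute error between predicted and observed durations."""
    absolute = [abs(p - o) for p, o in zip(predicted, observed)]
    return sum(absolute) / len(absolute)


def pearson(predicted, observed):
    """Pearson correlation coefficient between predicted and observed durations."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / math.sqrt(var_p * var_o)
```

In use, a model's predicted durations (in milliseconds) would be compared against the annotated durations from the corpus, for example `rmse(predicted, observed)`. Lower error values and a correlation closer to 1 indicate a better fit.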