Pub. online:5 Aug 2022
Type:Research Article
Open Access
Journal:Informatica
Volume 16, Issue 2 (2005), pp. 193–202
Abstract
One of the components of a text-to-speech synthesis system is the database of sounds. Two Lithuanian diphone databases in the MBROLA format are presented in this paper. The list of phonemes and the list of diphones necessary for Lithuanian text-to-speech synthesis are described. The paper also addresses phoneme combinations that do not occur in the Lithuanian language, as well as the transcription of Lithuanian text.
Journal:Informatica
Volume 27, Issue 3 (2016), pp. 573–586
Abstract
Phoneme duration modelling is one of the stages in prosody modelling for text-to-speech systems. The rule-based phoneme duration model proposed by Klatt (1979) remains a popular method. One of its main shortcomings is that the parameter values are selected experimentally. This work proposes a new iterative algorithm that automatically estimates the factors of the Klatt model from a corpus of annotated audio recordings of a speaker. Phoneme duration models were built for three different Lithuanian speakers. The quality of the estimated phoneme durations was evaluated by the root mean square error, the mean absolute error and the correlation coefficient.
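As a rough illustration of the quantities named in this abstract, the sketch below applies the Klatt-style duration rule (minimum duration plus a scaled compressible part) and computes the three evaluation measures. The function names and the example numbers are assumptions made for illustration; the iterative factor-estimation algorithm of the paper is not reproduced here.

```python
import math

def klatt_duration(inherent_ms, minimum_ms, factors):
    """Klatt-style rule: only the part above the minimum duration is scaled.

    `factors` holds the multiplicative context factors (hypothetical values).
    """
    scale = 1.0
    for f in factors:
        scale *= f
    return minimum_ms + (inherent_ms - minimum_ms) * scale

def rmse(pred, ref):
    """Root mean square error between predicted and reference durations."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def mae(pred, ref):
    """Mean absolute error between predicted and reference durations."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)

def pearson(pred, ref):
    """Pearson correlation coefficient between the two sequences."""
    n = len(ref)
    mean_p, mean_r = sum(pred) / n, sum(ref) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov / math.sqrt(var_p * var_r)

# Made-up example: inherent 100 ms, minimum 40 ms, one shortening factor of 0.5.
example = klatt_duration(100.0, 40.0, [0.5])
```

With an empty factor list the rule returns the inherent duration unchanged, which makes a convenient sanity check when fitting factors.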
Journal:Informatica
Volume 25, Issue 4 (2014), pp. 551–562
Abstract
The present paper deals with building the text corpus for unit selection text-to-speech synthesis. During synthesis the target and concatenation costs are calculated, and these costs are usually based on the prosodic and acoustic features of sounds. If the cost calculation is moved to the phonological level, it is possible to simulate unit selection synthesis without any real recordings; in this case text transcriptions are sufficient. We propose to use the cost calculated during the simulated synthesis of test data to evaluate the quality of the text corpus. A greedy algorithm that maximizes the coverage of certain phonetic units is used to build the corpus. In this work, corpora optimized to cover phonetic units of different size and weight are evaluated.
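The greedy coverage idea mentioned in this abstract can be sketched as follows. This is a minimal, unweighted version under stated assumptions: sentences are given as lists of phoneme symbols, the phonetic units are plain n-grams, and the names `units` and `greedy_select` are hypothetical. The paper's actual unit definitions and weights are not reproduced.

```python
def units(sentence, n=2):
    """Return the set of n-grams over a sentence given as a list of phoneme symbols."""
    return {tuple(sentence[i:i + n]) for i in range(len(sentence) - n + 1)}

def greedy_select(sentences, max_sentences, n=2):
    """Repeatedly pick the sentence that adds the most not-yet-covered units."""
    covered = set()
    chosen = []
    remaining = list(range(len(sentences)))
    while remaining and len(chosen) < max_sentences:
        # Candidate with the largest gain; ties go to the earliest sentence.
        best = max(remaining, key=lambda i: len(units(sentences[i], n) - covered))
        gain = units(sentences[best], n) - covered
        if not gain:
            break  # nothing new can be covered any more
        covered |= gain
        chosen.append(best)
        remaining.remove(best)
    return chosen, covered

# Toy input with invented symbols, standing in for transcribed sentences.
toy = [list("abc"), list("ab"), list("cde")]
picked, covered_units = greedy_select(toy, max_sentences=3)
```

A weighted variant would replace `len(...)` in the selection key with a sum of per-unit weights, for example weights derived from unit frequency.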
Journal:Informatica
Volume 11, Issue 1 (2000), pp. 19–40
Abstract
The paper deals with one of the components of text-to-speech synthesis of the Lithuanian language, namely automatic text stressing. The present work substantiates the need to divide words into fixed and variable parts, which are used to build different grammatical forms, and to store only those parts in the dictionary rather than whole words. According to the method of inflexion, all words of the Lithuanian language are divided into three groups (noun-adjectives, verbs and non-inflectional words), and each group is analysed separately. For each group, the type of information to be stored and the form in which it is stored have been established, and an algorithm that recognises and stresses the grammatical form of a word has been presented.
Journal:Informatica
Volume 10, Issue 4 (1999), pp. 367–376
Abstract
This paper deals with one of the components of text-to-speech synthesis of the Lithuanian language, namely text transcription. A method based on formal rules is used for text transcription. In this work the suitability of this method is justified, the appropriate form of the rules is analysed, and the set of rules and the interpreting algorithm are presented. Contextual information, features of stress, syllable boundaries and softness are used in the rules.
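To make the notion of context-dependent transcription rules more concrete, here is a minimal sketch of a rule interpreter. The rule format (target letters, left context, right context, output symbols) and the two rules themselves are invented placeholders, not the rule set or interpreting algorithm of the paper; stress, syllable boundaries and softness are left out.

```python
import re

# Each rule: (letters to rewrite, left-context regex, right-context regex, output symbols).
# Hypothetical placeholder rules for illustration only.
RULES = [
    ("ch", "", "", ["x"]),     # a two-letter sequence mapped to one symbol
    ("n", "", "[kg]", ["N"]),  # "n" rewritten only when followed by "k" or "g"
]

def transcribe(text, rules=RULES):
    """Scan left to right; apply the first rule whose target and contexts match."""
    out = []
    i = 0
    while i < len(text):
        for target, left, right, symbols in rules:
            if not text.startswith(target, i):
                continue
            end = i + len(target)
            if left and not re.search(left + "$", text[:i]):
                continue
            if right and not re.match(right, text[end:]):
                continue
            out.extend(symbols)
            i = end
            break
        else:
            # No rule matched: copy the letter unchanged.
            out.append(text[i])
            i += 1
    return out
```

Because rules are tried in order, more specific rules would normally be listed before more general ones in such a scheme.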