Pub. online: 5 Aug 2022. Type: Research Article. Open Access.
Journal:Informatica
Volume 16, Issue 2 (2005), pp. 193–202
Abstract
One of the components of a text-to-speech synthesis system is the database of sounds. This paper presents two Lithuanian diphone databases in the MBROLA format. The list of phonemes and the list of diphones necessary for Lithuanian text-to-speech synthesis are described. The paper also addresses phoneme combinations that do not occur in the Lithuanian language and the transcription of Lithuanian text.
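As a rough illustration of how a diphone list relates to a phoneme list, the sketch below collects the adjacent phoneme pairs that occur in a set of transcribed utterances. The function names, the silence symbol `_`, and the phoneme symbols in the usage note are assumptions made for illustration; they are not the phoneme list or the procedure described in the paper.

```python
def diphones(phonemes, silence="_"):
    """Return the diphones (adjacent phoneme pairs) of one transcribed utterance.

    The utterance is padded with a silence symbol on both sides so that the
    transitions into the first phoneme and out of the last one are included.
    The silence symbol is an assumption of this sketch.
    """
    padded = [silence, *phonemes, silence]
    return list(zip(padded, padded[1:]))


def diphone_inventory(utterances, silence="_"):
    """Collect the set of distinct diphones occurring in several utterances."""
    inventory = set()
    for phonemes in utterances:
        inventory.update(diphones(phonemes, silence))
    return inventory
```

For example, `diphones(["a", "b"])` yields the pairs `("_", "a")`, `("a", "b")` and `("b", "_")`. Pairs that never appear in the inventory built from a large transcribed corpus are candidates for combinations the language does not use.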
Journal:Informatica
Volume 27, Issue 3 (2016), pp. 573–586
Abstract
Phoneme duration modelling is one of the stages in prosody modelling for text-to-speech systems. The rule-based phoneme duration model proposed by Klatt (1979) is still a popular method. One of its main shortcomings is that the parameter values are selected experimentally. This work proposes a new iterative algorithm that automatically estimates the factors of the Klatt model from a corpus of annotated audio recordings of a speaker. Phoneme duration models were built for three different Lithuanian speakers. The quality of the phoneme duration estimates was evaluated by the root mean square error, the mean absolute error and the correlation coefficient.
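The sketch below shows the Klatt duration rule as it is commonly stated (a minimum duration plus a scaled share of the difference between the inherent and the minimum duration) together with the three evaluation measures named in the abstract. It does not reproduce the paper's iterative estimation algorithm; the function names and the example numbers are assumptions for illustration only.

```python
import math


def klatt_duration(inherent, minimum, factors):
    """Klatt-style rule: DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100.

    PRCNT starts at 100 and is multiplied by each applicable rule factor.
    """
    percent = 100.0
    for factor in factors:
        percent *= factor
    return minimum + (inherent - minimum) * percent / 100.0


def rmse(predicted, actual):
    """Root mean square error between predicted and measured durations."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def mae(predicted, actual):
    """Mean absolute error between predicted and measured durations."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def correlation(predicted, actual):
    """Pearson correlation coefficient between predictions and measurements."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)
```

With a hypothetical inherent duration of 100 ms, a minimum duration of 40 ms and a single shortening factor of 0.5, `klatt_duration(100, 40, [0.5])` gives 70 ms.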
Journal:Informatica
Volume 25, Issue 4 (2014), pp. 551–562
Abstract
The present paper deals with building the text corpus for unit selection text-to-speech synthesis. During synthesis the target and concatenation costs are calculated and these costs are usually based on the prosodic and acoustic features of sounds. If the cost calculation is moved to the phonological level, it is possible to simulate unit selection synthesis without any real recordings; in this case text transcriptions are sufficient. We propose to use the cost calculated during the test data synthesis simulation to evaluate the text corpus quality. The greedy algorithm that maximizes coverage of certain phonetic units will be used to build the corpus. In this work the corpora optimized to cover phonetic units of different size and weight are evaluated.
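A greedy coverage-maximizing selection of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `greedy_corpus`, `units_of` and `weight`, and the use of character bigrams as stand-in phonetic units, are assumptions.

```python
def greedy_corpus(sentences, units_of, max_sentences, weight=None):
    """Greedily pick sentences that add the most (weighted) uncovered units.

    units_of(sentence) must return a set of units; weight maps a unit to its
    weight and defaults to 1 for every unit. Both are hypothetical hooks.
    """
    weight = weight or {}
    covered = set()
    selected = []
    remaining = list(sentences)

    def gain(sentence):
        # Total weight of the units this sentence would newly cover.
        return sum(weight.get(u, 1) for u in units_of(sentence) - covered)

    while remaining and len(selected) < max_sentences:
        best = max(remaining, key=gain)
        if gain(best) == 0:
            break  # no remaining sentence adds anything new
        selected.append(best)
        covered |= units_of(best)
        remaining.remove(best)
    return selected, covered


def char_bigrams(text):
    """Stand-in unit extractor: character bigrams instead of real phonetic units."""
    return {text[i:i + 2] for i in range(len(text) - 1)}
```

Calling `greedy_corpus(["abc", "bcd", "abcd", "ab"], char_bigrams, 2)` selects only `"abcd"`, because it already covers every bigram found in the other candidates. In a real setting `units_of` would operate on phonetic transcriptions, and the weights could reflect unit frequency.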
Journal:Informatica
Volume 12, Issue 2 (2001), pp. 315–336
Abstract
The paper deals with automatic stressing of Lithuanian text. In previous work the author presented a dictionary-based algorithm for automatic stressing of Lithuanian text. The aim of the present work is to improve that algorithm by including formal stressing rules for nouns and adjectives. With these rules, words that are not present in the dictionary, such as diminutives, names and degrees of adjectives, can be stressed. The work analyses when it is more convenient to formulate rules manually and when to generate them automatically. A method for formulating rules manually is described and a set of such rules is presented. In addition, an algorithm is given for generating stressing rules with the help of a dictionary of noun and adjective stems.
Journal:Informatica
Volume 11, Issue 1 (2000), pp. 19–40
Abstract
The paper deals with one of the components of text-to-speech synthesis of the Lithuanian language, namely automatic text stressing. The present work substantiates the necessity to divide words into fixed and variable parts used to build different grammatical forms, and to store only those parts rather than the whole words in the dictionary. According to the inflexion method, all words of the Lithuanian language are divided into three groups (noun-adjectives, verbs and non-inflectional words) and each group is analysed separately. The type of information to be stored, and the form in which it is stored, is established for each group, and an algorithm that recognises and stresses the grammatical form of a word is presented.