Pub. online:6 Dec 2022Type:Research ArticleOpen Access
Journal:Informatica
Volume 33, Issue 4 (2022), pp. 795–832
Abstract
Intonation is a complex suprasegmental phenomenon essential for speech processing. However, it is still largely understudied, especially in the case of under-resourced languages, such as Lithuanian. The current paper focuses on intonation in Lithuanian, a Baltic pitch-accent language with free stress and tonal variations on accented heavy syllables. Due to historical circumstances, the description and analysis of Lithuanian intonation were carried out within different theoretical frameworks and in several languages, which makes them hardly accessible to the international research community. This paper is the first attempt to gather research on Lithuanian intonation from both the Lithuanian and the Western traditions, the structuralist and generativist points of view, and the linguistic and modelling perspectives. The paper identifies issues in existing research that require special attention and proposes directions for future investigations both in linguistics and modelling.
Journal:Informatica
Volume 24, Issue 3 (2013), pp. 435–446
Abstract
The performance of an automatic speech recognition system heavily depends on the used feature set. Quality of speech recognition features is estimated by classification error, but then the recognition experiments must be performed, including both front-end and back-end implementations. We propose a method for features quality estimation that does not require recognition experiments and accelerate automatic speech recognition system development. The key component of our method is usage of metrics right after front-end features computation. The experimental results show that our method is suitable for recognition systems with back-end Euclidean space classifiers.
Journal:Informatica
Volume 18, Issue 3 (2007), pp. 395–406
Abstract
This paper describes a framework for making up a set of syllables and phonemes that subsequently is used in the creation of acoustic models for continuous speech recognition of Lithuanian. The target is to discover a set of syllables and phonemes that is of utmost importance in speech recognition. This framework includes operations with lexicon, and transcriptions of records. To facilitate this work, additional programs have been developed that perform word syllabification, lexicon adjustment, etc. Series of experiments were done in order to establish the framework and model syllable- and phoneme-based speech recognition. Dominance of a syllable in lexicon has improved speech recognition results and encouraged us to move away from a strict definition of syllable, i.e., a syllable becomes a simple sub-word unit derived from a syllable. Two sets of syllables and phonemes and two types of lexicons have been developed and tested. The best recognition accuracy achieved 56.67% ±0.33. The speech recognition system is based on Hidden Markov Models (HMM). The continuous speech corpus LRN0 was used for the speech recognition experiments.
Journal:Informatica
Volume 17, Issue 4 (2006), pp. 587–600
Abstract
There is presented a technique of transcribing Lithuanian text into phonemes for speech recognition. Text-phoneme transformation has been made by formal rules and the dictionary. Formal rules were designed to set the relationship between segments of the text and units of formalized speech sounds – phonemes, dictionary – to correct transcription and specify stress mark and position. Proposed the automatic transcription technique was tested by comparing its results with manually obtained ones. The experiment has shown that less than 6% of transcribed words have not matched.
Journal:Informatica
Volume 15, Issue 4 (2004), pp. 465–474
Abstract
The development of Lithuanian HMM/ANN speech recognition system, which combines artificial neural networks (ANNs) and hidden Markov models (HMMs), is described in this paper. A hybrid HMM/ANN architecture was applied in the system. In this architecture, a fully connected three‐layer neural network (a multi‐layer perceptron) is trained by conventional stochastic back‐propagation algorithm to estimate the probability of 115 context‐independent phonetic categories and during recognition it is used as a state output probability estimator. The hybrid HMM/ANN speech recognition system based on Mel Frequency Cepstral Coefficients (MFCC) was developed using CSLU Toolkit. The system was tested on the VDU isolated‐word Lithuanian speech corpus and evaluated on a speaker‐independent ∼750 distinct isolated‐word recognition task. The word recognition accuracy obtained was about 86.7%.