Pub. online: 1 Jan 2018
Type: Research Article
Open Access
Journal: Informatica
Volume 29, Issue 3 (2018), pp. 487–498
Abstract
This paper investigates the design of a speech corpus for human-computer interfaces that work in voice recognition and synthesis mode. The specific requirements that speech recognizers and speech synthesizers place on a speech corpus are highlighted. It is argued that such a corpus has to consist of two parts: one serving the needs of Lithuanian text-to-speech synthesizers, the other serving the needs of Lithuanian speech recognition engines. It has been determined that the part designed for speech recognition engines must make it possible to represent language-specific features through the use of different phoneme sets. Based on the research results, the two-part speech corpus Liepa was developed. This corpus opens possibilities for cost-effective and flexible development of human-computer interfaces working in voice recognition and synthesis mode.