Informatica

Evaluation of Lithuanian Speech-to-Text Transcribers
Pijus Kasparaitis  

https://doi.org/10.15388/25-INFOR591
Pub. online: 16 April 2025      Type: Research Article      Open Access

Received: 1 July 2024
Accepted: 1 April 2025
Published: 16 April 2025

Abstract

For more than two decades, Lithuanian speech recognition has been researched solely in Lithuania, because the work required deep knowledge of the Lithuanian language. Advances in AI now make it possible to build high-quality speech-to-text systems without native knowledge of the language, provided that sufficient annotated data is available. This study evaluated 18 Lithuanian speech transcribers on a short recording; the 7 best were then selected and evaluated on extensive data. The top system achieved a word error rate (WER) of 5.1% for Lithuanian words, and three others achieved 8.7–9.2%. For other word-sized tokens, such as numbers, speech disfluencies, abbreviations, and foreign words, a classification adapted to the Lithuanian language is proposed. Different processing strategies for tokens of these classes were examined, and the study assessed which transcribers tend to follow which strategies.
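As an illustration of the metrics named in the abstract and keywords, the following is a minimal sketch of word error rate (WER) and character error rate (CER) using their standard edit-distance definitions. It is not the evaluation code used in the study: the function names are hypothetical, and the whitespace tokenization and absence of text normalization are assumptions, since the abstract does not describe how numbers, abbreviations, or foreign words were handled before scoring.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: substitutions, deletions, insertions each cost 1."""
    # prev[j] holds the distance between the processed part of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()  # assumption: plain whitespace tokenization
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For example, with the reference "labas rytas visiems" and the hypothesis "labas vakaras visiems", one of three reference words is substituted, so `wer` returns one third. A real evaluation would likely also need to decide how case, punctuation, and non-lexical tokens are normalized before scoring.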


Biographies

Kasparaitis Pijus
pijus.kasparaitis@mif.vu.lt

P. Kasparaitis (born in 1967) graduated from Vilnius University (Faculty of Mathematics) in 1991. In 2001, he defended his PhD thesis “Lithuanian Text-to-Speech Synthesis”. Presently, he is an associate professor at Vilnius University. His current research interests include text-to-speech synthesis, speech recognition, and other areas of computer linguistics.



Copyright
© 2025 Vilnius University
Open access article under the CC BY license.

Keywords
speech-to-text transcription, automatic speech recognition, word error rate, character error rate, Lithuanian


INFORMATICA

  • Online ISSN: 1822-8844
  • Print ISSN: 0868-4952