An Overview of Lithuanian Intonation: A Linguistic and Modelling Perspective

Melnik-Leroy, Gerda Ana; Bernatavičienė, Jolita; Korvel, Gražina; Navickas, Gediminas; Tamulevičius, Gintautas; Treigys, Povilas

doi:10.15388/22-INFOR502

Informatica

Volume 33, Issue 4 (2022), pp. 795–832

Pub. online: 6 December 2022

Type: Research Article

Open Access

Received: 1 May 2022

Accepted: 1 November 2022

Published: 6 December 2022

Abstract

Intonation is a complex suprasegmental phenomenon essential for speech processing. However, it remains largely understudied, especially in under-resourced languages such as Lithuanian. The current paper focuses on intonation in Lithuanian, a Baltic pitch-accent language with free stress and tonal variations on accented heavy syllables. Due to historical circumstances, Lithuanian intonation has been described and analysed within different theoretical frameworks and in several languages, which makes this work difficult for the international research community to access. This paper is the first attempt to gather research on Lithuanian intonation from both the Lithuanian and the Western traditions, the structuralist and generativist points of view, and the linguistic and modelling perspectives. The paper identifies issues in existing research that require special attention and proposes directions for future investigations in both linguistics and modelling.

Biographies

Melnik-Leroy Gerda Ana

gerda.melnik@mif.vu.lt

G. A. Melnik-Leroy is a researcher at the Cognitive Computing Group at the Institute of Data Science and Digital Technologies (Vilnius University). She holds a doctoral degree in cognitive science from École Normale Supérieure, CNRS, EHESS (Paris), one of the world’s leading institutions in the field. Her research is mainly focused on topics in cognitive psychology and psycholinguistics, including speech processing, the mental lexicon and language acquisition. She also works on practical applications of findings from cognitive psychology to other fields, such as speech technologies, operational research and educational technologies. She has won several international research grants and successfully led research projects.

Bernatavičienė Jolita

J. Bernatavičienė graduated from Vilnius Pedagogical University in 2004 with an MS degree in informatics. In 2008, she received a doctoral degree in computer science from the Institute of Mathematics and Informatics jointly with Vilnius Gediminas Technical University. She is a senior researcher at the Institute of Data Science and Digital Technologies of Vilnius University. Her research interests include databases, data mining, neural networks, image analysis, visualization, decision support systems and Internet technologies. She supervises 2 PhD students and has written more than 60 articles, 15 of which are in the CA WoS database.

Korvel Gražina

G. Korvel received a BS degree in mathematics and an MS degree in informatics from Vilnius Pedagogical University (now Vytautas Magnus University Education Academy), Lithuania, in 2007 and 2009, respectively, and a PhD degree from the Institute of Data Science and Digital Technologies, Vilnius University, in 2013. She is currently a senior researcher with the Institute of Data Science and Digital Technologies. Her research interests include speech signal processing, natural language processing, development of mathematical models, applications of soft computing, and computational intelligence. Her main scientific results have been published in more than 30 papers and discussed at more than 40 national and international conferences. Some of her works received the Diploma for the Best Presentation. She is a three-time winner of the Lithuanian Academy of Sciences Young Scientist Award. She received acknowledgment from the Prime Minister of Lithuania for her scientific results in 2013 and 2019. G. Korvel has taken part in 4 research projects and 3 COST actions. She is a reviewer for many scientific journals, a member of the editorial board of The Journal of Intelligent Information Systems, and has been a session organizer at international conferences.

Navickas Gediminas

G. Navickas graduated from Vilnius Gediminas Technical University, receiving a BS degree in engineering informatics in 2000 and an MS degree in statistics in 2002. He works at the Institute of Data Science and Digital Technologies of Vilnius University. His research interests include automatic Lithuanian speech recognition, Lithuanian speech synthesis, deep neural networks, speech signal processing, speech recognition methods and algorithms, speech interface applications in different fields, and robotics.

Tamulevičius Gintautas

G. Tamulevičius is a senior researcher at the Institute of Data Science and Digital Technologies (Vilnius University). His research interests include the analysis and modelling of speech signals, the digital processing of speech signals, and the applications of acoustic analysis of speech signals. Current activities include academic research, administration of the study process, and teaching students.

Treigys Povilas

P. Treigys is a professor at the Faculty of Mathematics and Informatics at Vilnius University. He is a principal researcher and the head of the Signal and Image Analysis group at the Vilnius University Institute of Data Science and Digital Technologies. His interests include image analysis, detection and object feature extraction in image processing, automated image object segmentation, optimization methods, artificial neural networks, and software engineering. Povilas Treigys is a reviewer for the journals Informatica, Sensors, Nonlinear Analysis, and The Baltic Journal of Modern Computing, and was recently invited to join the editorial board of the conference DAMSS (Data Analysis Methods for Software Systems). He has supervised 1 postdoctoral researcher and 6 PhD students and has written more than 70 articles, 27 of which are in the CA WoS database. He was the leader of the Lithuanian working group in 2 international projects.

Open access article under the CC BY license.

Keywords

intonation, stress, pitch accent, intonation modelling, speech recognition, Lithuanian, under-resourced language

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
