Transformer-Based Detection of Propaganda Techniques in a Low-Resource Language: A Case Study in Lithuanian

Rizgelienė, Ieva; Zaranka, Paulius; Korvel, Gražina; Marcinkevičius, Virginijus

doi:10.15388/26-INFOR633

Informatica

Pub. online: 12 June 2026
Type: Research Article

Open Access

Received: 1 April 2026
Accepted: 1 June 2026
Published: 12 June 2026

Abstract

Propaganda techniques are a key tool for creating misleading content, which is often disseminated in native languages to increase its impact. It is therefore increasingly important to develop detection models not only for high-resource languages but also for low-resource languages, where propaganda detection still faces significant limitations. This study presents the first approach to automated propaganda technique detection in Lithuanian using the HALT-PROP corpus. We adapt the standard framework to account for frequent overlap between techniques. Experiments with the Lithuanian transformer LT-MLKM-modernBERT show that BILOU tagging improves span identification, while sentence classification based on span-level information improves detection for most techniques. The results also indicate that training separate binary classifiers is more effective than multi-label classification in this setting. Overall, the proposed approach outperforms GPT-5.3 on most techniques and provides a strong baseline for propaganda technique detection in Lithuanian.
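As an illustration of the BILOU scheme mentioned in the abstract, the following minimal sketch converts token-level spans into BILOU tags (Beginning, Inside, Last, Outside, Unit). It is not the authors' implementation. Building one independent tag sequence per technique is an assumption about how overlapping techniques could be kept apart, and the function names, technique labels, and example sentence are all hypothetical.

```python
from collections import defaultdict


def spans_to_bilou(n_tokens, spans):
    """Convert token spans (start inclusive, end exclusive) to BILOU tags.

    Spans are assumed not to overlap within a single technique; if they do,
    later spans overwrite earlier ones.
    """
    tags = ["O"] * n_tokens
    for start, end in spans:
        if not 0 <= start < end <= n_tokens:
            raise ValueError(f"span ({start}, {end}) is out of range")
        if end - start == 1:
            tags[start] = "U"  # single-token span
        else:
            tags[start] = "B"  # first token of the span
            for i in range(start + 1, end - 1):
                tags[i] = "I"  # interior tokens
            tags[end - 1] = "L"  # last token of the span
    return tags


def tag_per_technique(n_tokens, labelled_spans):
    """Build one BILOU sequence per technique (assumed design, see lead-in).

    Keeping a separate sequence for each technique means spans of different
    techniques can overlap without colliding in a single tag sequence.
    """
    by_technique = defaultdict(list)
    for technique, start, end in labelled_spans:
        by_technique[technique].append((start, end))
    return {t: spans_to_bilou(n_tokens, s) for t, s in by_technique.items()}


# Hypothetical example: two techniques overlap on the token "traitors".
tokens = ["These", "traitors", "will", "destroy", "everything", "we", "love"]
labelled_spans = [
    ("Name_Calling", 1, 2),    # hypothetical label name
    ("Appeal_to_Fear", 1, 5),  # hypothetical label name
]
tags = tag_per_technique(len(tokens), labelled_spans)
for technique, sequence in tags.items():
    print(technique, list(zip(tokens, sequence)))
```

Under this assumed setup, each per-technique sequence could serve as the training target for its own binary tagger, which mirrors the abstract's comparison of separate binary classifiers against a single multi-label model.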

References

Alam, F., Mubarak, H., Zaghouani, W., Da San Martino, G., Nakov, P. (2022). Overview of the WANLP 2022 shared task on propaganda detection in Arabic. In: Bouamor, H., Al-Khalifa, H., Darwish, K., Rambow, O., Bougares, F., Abdelali, A., Tomeh, N., Khalifa, S., Zaghouani, W. (Eds.), Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP). Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, pp. 108–118. https://doi.org/10.18653/v1/2022.wanlp-1.11.

Barrón-Cedeño, A., Jaradat, I., Da San Martino, G., Nakov, P. (2019). Proppy: organizing the news based on their propagandistic content. Information Processing & Management, 56(5), 1849–1864. https://doi.org/10.1016/j.ipm.2019.03.005. https://www.sciencedirect.com/science/article/pii/S0306457318306058.

Conneau, A., Khandelwal, K., Goyal, N., Chaudhary, V., Wenzek, G., Guzmán, F., Grave, E., Ott, M., Zettlemoyer, L., Stoyanov, V. (2020). Unsupervised cross-lingual representation learning at scale. In: Jurafsky, D., Chai, J., Schluter, N., Tetreault, J. (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 8440–8451. https://doi.org/10.18653/v1/2020.acl-main.747.

Da San Martino, G., Yu, S., Barrón-Cedeño, A., Petrov, R., Nakov, P. (2019). Fine-grained analysis of propaganda in news articles. In: Inui, K., Jiang, J., Ng, V., Wan, X. (Eds.), Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Association for Computational Linguistics, Hong Kong, China, pp. 5636–5646. https://doi.org/10.18653/v1/D19-1565.

Da San Martino, G., Barrón-Cedeño, A., Wachsmuth, H., Petrov, R., Nakov, P. (2020). SemEval-2020 Task 11: detection of propaganda techniques in news articles. In: Herbelot, A., Zhu, X., Palmer, A., Schneider, N., May, J., Shutova, E. (Eds.), Proceedings of the Fourteenth Workshop on Semantic Evaluation. International Committee for Computational Linguistics, Barcelona, pp. 1377–1414. https://doi.org/10.18653/v1/2020.semeval-1.186.

Dimitrov, D., Bin Ali, B., Shaar, S., Alam, F., Silvestri, F., Firooz, H., Nakov, P., Da San Martino, G. (2021). SemEval-2021 Task 6: detection of persuasion techniques in texts and images. In: Palmer, A., Schneider, N., Schluter, N., Emerson, G., Herbelot, A., Zhu, X. (Eds.), Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021). Association for Computational Linguistics, pp. 70–98. https://doi.org/10.18653/v1/2021.semeval-1.7.

Dimitrov, D., Alam, F., Hasanain, M., Hasnat, A., Silvestri, F., Nakov, P., Da San Martino, G. (2024). SemEval-2024 Task 4: multilingual detection of persuasion techniques in memes. In: Ojha, A.K., Doğruöz, A.S., Tayyar Madabushi, H., Da San Martino, G., Rosenthal, S., Rosá, A. (Eds.), Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024). Association for Computational Linguistics, Mexico City, Mexico, pp. 2009–2026. https://doi.org/10.18653/v1/2024.semeval-1.275.

Hasanain, M., Hasan, M.A., Ahmad, F., Suwaileh, R., Biswas, M.R., Zaghouani, W., Alam, F. (2024). ArAIEval shared task: propagandistic techniques detection in unimodal and multimodal Arabic content. In: Habash, N., Bouamor, H., Eskander, R., Tomeh, N., Abu Farha, I., Abdelali, A., Touileb, S., Hamed, I., Onaizan, Y., Alhafni, B., Antoun, W., Khalifa, S., Haddad, H., Zitouni, I., AlKhamissi, B., Almatham, R., Mrini, K. (Eds.), Proceedings of the Second Arabic Natural Language Processing Conference. Association for Computational Linguistics, Bangkok, Thailand, pp. 456–466. https://doi.org/10.18653/v1/2024.arabicnlp-1.44.

Horák, A., Sabol, R., Herman, O., Baisa, V. (2024). Recognition of propaganda techniques in newspaper texts: fusion of content and style analysis. Expert Systems with Applications, 251, 124085. https://doi.org/10.1016/j.eswa.2024.124085. https://www.sciencedirect.com/science/article/pii/S0957417424009515.

Jose, J., Geeng, C., Morales, K.O., McCoy, D., Greenstadt, R. (2025). What’s in a label? Propaganda labels and user sharing behavior on social media platforms. Proceedings of the International AAAI Conference on Web and Social Media, 19(1), 918–934. https://doi.org/10.1609/icwsm.v19i1.35853.

Moral, P., Marco, G., Gonzalo, J., Carrillo-de-Albornoz, J., Gonzalo-Verdugo, I. (2023). Overview of DIPROMATS 2023: automatic detection and characterization of propaganda techniques in messages from diplomats and authorities of world powers. Procesamiento del Lenguaje Natural, 71, 397–407. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6569.

Moral, P., Fraile, J.M., Marco, G., Peñas, A., Gonzalo, J. (2024). Overview of DIPROMATS 2024: detection, characterization and tracking of propaganda in messages from diplomats and authorities of world powers. Procesamiento del Lenguaje Natural, 73, 347–358.

Perišić, A., Vanbelle, S., Petričević, R.B. (2025). Quantifying binary classifier algorithms similarity with a consensus agreement approach. Informatica, 36(3), 657–676. https://doi.org/10.15388/25-INFOR601.

Piskorski, J., Stefanovitch, N., Da San Martino, G., Nakov, P. (2023). SemEval-2023 Task 3: detecting the category, the framing, and the persuasion techniques in online news in a multi-lingual setup. In: Ojha, A.K., Doğruöz, A.S., Da San Martino, G., Tayyar Madabushi, H., Kumar, R., Sartori, E. (Eds.), Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023). Association for Computational Linguistics, Toronto, Canada, pp. 2343–2361. https://doi.org/10.18653/v1/2023.semeval-1.317.

Rashkin, H., Choi, E., Jang, J.Y., Volkova, S., Choi, Y. (2017). Truth of varying shades: analyzing language in fake news and political fact-checking. In: Palmer, M., Hwa, R., Riedel, S. (Eds.), Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Copenhagen, Denmark, pp. 2931–2937. https://doi.org/10.18653/v1/D17-1317.

Ratinov, L., Roth, D. (2009). Design Challenges and Misconceptions in Named Entity Recognition. In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL-2009). Association for Computational Linguistics, Boulder, Colorado, pp. 147–155. https://aclanthology.org/W09-1119/.

Rizgelienė, I., Zubaitienė, V., Maliukevičius, N., Marcinkevičius, V. (2025). HALT-PROP: Human-Annotated Lithuanian Textual Corpus for Propaganda Narratives and Techniques. Scientific Data, 13(1), 47. https://doi.org/10.1038/s41597-025-06367-w.

State Digital Solutions Agency (SDSA) (2025). LT-MLKM-modernBERT: Lithuanian ModernBERT Language Model. https://huggingface.co/VSSA-SDSA/LT-MLKM-modernBERT. Developed by Vytautas Magnus University (VMU), UAB Neurotechnology, UAB Tilde informacinės technologijos, MB Krilas.

Ulčar, M., Robnik-Šikonja, M. (2020). EMBEDDIA: LitLat BERT: Model Card. https://huggingface.co/EMBEDDIA/litlat-bert. XLM-RoBERTa-base configuration; 12 layers, 12 heads; vocabulary size 84,201.

Biographies

Rizgelienė Ieva

ieva.rizgeliene@mif.vu.lt

I. Rizgelienė is a PhD student at the Institute of Data Science and Digital Technologies, Vilnius University. Her primary research interests include propaganda detection and analysis, with an emphasis on low-resource languages.

Zaranka Paulius

paulius.zaranka@mif.vu.lt

P. Zaranka received his master’s degree in computer modelling from Vilnius University in 2025 and is currently a lecturer in NLP at Vilnius University. His primary research interests include large language models, natural language processing, and agent-based modelling.

Korvel Gražina

grazina.korvel@mif.vu.lt

Marcinkevičius Virginijus

virginijus.marcinkevicius@mif.vu.lt

Full article

Open access article under the CC BY license.

Keywords

propaganda technique detection; low-resource language; transformers

Funding

This research was supported by the Lithuanian Government Priority Research Program “Building Societal Resilience and Crisis Management in the Context of Contemporary Geopolitical Developments” (implemented through the Lithuania Research Council) under grant number S-VIS-23-8. Project title: “Propaganda and Disinformation Research: Machine Learning-Based Automatic Detection, Impact and Societal Resilience.”
