Informatica

Improving Statistical Machine Translation Quality Using Differential Evolution
Volume 30, Issue 4 (2019), pp. 629–645
Jani Dugonik   Borko Bošković   Janez Brest   Mirjam Sepesy Maučec  

https://doi.org/10.15388/Informatica.2019.222
Pub. online: 1 January 2019      Type: Research Article      Open Access

Received
1 March 2018
Accepted
1 June 2019
Published
1 January 2019

Abstract

Machine Translation has become an important tool for overcoming language barriers. Translation quality depends on the language pair and on the methods used. The research presented in this paper builds on well-known standard methods for Statistical Machine Translation and extends them with a newly proposed approach for optimizing the weights of the translation system's components. Better component weights improve translation quality. Most machine translation systems translate to or from English, and in our research English is paired with a Slavic language, Slovenian. In our experiment, we built two Statistical Machine Translation systems for the Slovenian-English language pair using the Acquis Communautaire corpus. Both systems were optimized using self-adaptive Differential Evolution and compared with other related optimization methods. The results show an improvement in translation quality and are comparable to those of the other related methods.
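The weight optimization described in the abstract can be illustrated with a small sketch. It assumes one widely used self-adaptive variant (jDE-style adaptation of the scale factor F and crossover rate CR on top of DE/rand/1/bin); the abstract does not say which variant the authors use, so this is not their implementation. The names `jde_minimize`, `negative_quality`, and `NBEST`, the toy n-best data, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def jde_minimize(objective, dim, bounds=(-1.0, 1.0), pop_size=20,
                 generations=200, tau1=0.1, tau2=0.1,
                 f_low=0.1, f_up=0.9, seed=0):
    """Minimize `objective` over a box with a jDE-style self-adaptive DE."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    F = [0.5] * pop_size   # per-individual scale factors
    CR = [0.9] * pop_size  # per-individual crossover rates

    for _ in range(generations):
        new_pop, new_cost = list(pop), list(cost)
        for i in range(pop_size):
            # Self-adaptation: occasionally resample F and CR for the trial.
            f_i = f_low + rng.random() * f_up if rng.random() < tau1 else F[i]
            cr_i = rng.random() if rng.random() < tau2 else CR[i]
            # DE/rand/1 mutation with binomial crossover.
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)
            trial = []
            for j in range(dim):
                if rng.random() < cr_i or j == j_rand:
                    v = pop[r1][j] + f_i * (pop[r2][j] - pop[r3][j])
                    trial.append(min(hi, max(lo, v)))  # clamp to the box
                else:
                    trial.append(pop[i][j])
            trial_cost = objective(trial)
            # Selection: a successful trial also passes on its F and CR.
            if trial_cost <= cost[i]:
                new_pop[i], new_cost[i] = trial, trial_cost
                F[i], CR[i] = f_i, cr_i
        pop, cost = new_pop, new_cost

    best = min(range(pop_size), key=lambda k: cost[k])
    return pop[best], cost[best]


# Hypothetical n-best lists: (feature vector, quality score) per hypothesis.
NBEST = [
    [((-1.0, -3.0), 0.2), ((-2.0, -1.0), 0.8), ((-3.0, -2.5), 0.1)],
    [((-1.5, -2.0), 0.3), ((-2.5, -0.5), 0.9), ((-1.0, -4.0), 0.1)],
]


def negative_quality(weights):
    """Pick the top-scoring hypothesis per sentence; return minus total quality."""
    total = 0.0
    for hyps in NBEST:
        _, quality = max(
            hyps, key=lambda h: sum(w * f for w, f in zip(weights, h[0])))
        total += quality
    return -total


if __name__ == "__main__":
    weights, cost = jde_minimize(negative_quality, dim=2)
    print("weights:", weights, "total quality:", -cost)
```

In a real tuning setup the objective would instead rerank or re-decode a development set with the candidate weights and return a negated translation metric; the toy quality scores above only stand in for that step.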


Biographies

Dugonik Jani
jani.dugonik@um.si

J. Dugonik received his BSc and MSc in computer science from the University of Maribor, Maribor, Slovenia, in 2010 and 2013, respectively. He is currently a teaching assistant at the Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia. He worked in the Laboratory for Computer Architecture and Programming Languages, University of Maribor, from 2011, and since 2017 he has been working in the Laboratory for Real-Time Systems. His research interests include evolutionary computing, optimization, natural language processing and deep learning.

Bošković Borko
borko.boskovic@um.si

B. Bošković received his BSc and PhD in computer science from the University of Maribor, Maribor, Slovenia, in 2004 and 2010, respectively. He is currently an assistant professor at the Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia. He has worked in the Laboratory for Computer Architecture and Programming Languages, University of Maribor, since 2000. His research interests include evolutionary computing, optimization, natural language processing and programming languages.

Brest Janez
janez.brest@um.si

J. Brest received his BSc, MSc, and PhD in computer science from the University of Maribor, Maribor, Slovenia, in 1995, 1998, and 2000, respectively. He has been with the Laboratory for Computer Architecture and Programming Languages, University of Maribor, since 1993. He is currently a full professor and head of the Laboratory for Computer Architecture and Programming Languages.

Sepesy Maučec Mirjam
mirjam.sepesy@um.si

M. Sepesy Maučec received her BSc and PhD in computer science from the Faculty of Electrical Engineering and Computer Science at the University of Maribor in 1996 and 2001, respectively. She is currently an associate professor at the same faculty. Her research interests include language modelling, statistical machine translation, computational linguistics and evolutionary computing.



Copyright
© 2019 Vilnius University
Open access article under the CC BY license.

Keywords
statistical machine translation, differential evolution, optimization

Funding
The authors acknowledge the financial support from the Slovenian Research Agency (Research Core Funding No. P2-0041 – Computer Systems, Methodologies, and Intelligent Services; P2-0069 – Advanced methods of interaction in telecommunication).

INFORMATICA

  • Online ISSN: 1822-8844
  • Print ISSN: 0868-4952