Informatica

Reduction of Morpho-Syntactic Features in Statistical Machine Translation of Highly Inflective Language
Volume 21, Issue 1 (2010), pp. 95–116
Mirjam Sepesy Maučec, Janez Brest

https://doi.org/10.15388/Informatica.2010.275
Pub. online: 1 January 2010. Type: Research Article

Received: 1 April 2008
Accepted: 1 May 2008
Published: 1 January 2010

Abstract

We address the problem of statistical machine translation from a highly inflective language to a less inflective one. Statistical machine translation systems generally do not take the characteristics of inflective languages into account. Existing translation systems often treat different inflected word forms of the same lemma as if they were independent of each other, although interdependencies exist. On the other hand, reducing inflected word forms to common lemmas loses some information. It would therefore be reasonable to eliminate only those variations in inflected word forms that are not relevant for translation. Inflectional features of words are defined by morpho-syntactic description (MSD) tags, and we want to reduce them. Doing so requires explicit knowledge about both the source and the target language. The idea of the paper is to find the information-bearing MSDs in the source language by a data-driven approach. The task is performed by a global optimization algorithm named Differential Evolution. The experiments were performed using the freely available parallel English–Slovenian corpus SVEZ-IJS, which is lemmatized and annotated with MSD tags. The results show a promising direction toward an optimal subset of morpho-syntactic features.
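
As an illustration of what reducing MSD tags can mean in practice, the following minimal Python sketch masks out selected attribute positions of a tag. It is not the authors' implementation. It assumes positional tags in the MULTEXT-East style (first character is the word category, each later character is one attribute, `-` marks an unused position), and the function name `reduce_msd`, the mask layout `keep`, and the sample tags are all hypothetical. In the approach described in the abstract, the mask would be searched for by Differential Evolution rather than set by hand as it is here.

```python
from typing import Dict, Sequence


def reduce_msd(msd: str, keep: Dict[str, Sequence[bool]]) -> str:
    """Blank out attribute positions of an MSD tag that the mask marks as dropped.

    Assumption: msd[0] is the category letter and msd[1:] are positional
    attributes. Categories without a mask are returned unchanged.
    """
    if not msd:
        return msd
    category, attrs = msd[0], msd[1:]
    mask = keep.get(category)
    if mask is None:
        return msd
    kept = [
        attr if i < len(mask) and mask[i] else "-"
        for i, attr in enumerate(attrs)
    ]
    # Trailing placeholders carry no information, so strip them.
    return category + "".join(kept).rstrip("-")


# Hypothetical hand-set mask for nouns: keep type and number, drop gender and case.
keep = {"N": [True, False, True, False]}

print(reduce_msd("Ncmsn", keep))  # Nc-s
print(reduce_msd("Ncmsg", keep))  # Nc-s (same reduced tag as above)
```

With this mask, two noun forms that differ only in case map to the same reduced tag, which is the kind of merging of translation-irrelevant variants the abstract describes.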


Keywords
statistical machine translation, inflective language, morpho-syntactic description, Bleu metric


INFORMATICA

  • Online ISSN: 1822-8844
  • Print ISSN: 0868-4952
  • Copyright © 2023 Vilnius University
