Robust estimation of feature weights in Statistical Machine Translation

Cristina España-Bonet, Lluís Màrquez

Research output: Contribution to conference, Paper

2 Citations (Scopus)

Abstract

The weights of the various components in a standard Statistical Machine Translation model are usually estimated via Minimum Error Rate Training (MERT). MERT finds the optimum value of the weights on a development set, with the expectation that these optimal weights generalise well to other test sets. However, this is not always the case when domains differ. This work uses a perceptron algorithm to learn more robust weights that can be used on out-of-domain corpora without the need for specialised data. For an Arabic-to-English translation system, generalising the weights yields an improvement of more than 2 BLEU points over the MERT baseline using the same information.
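The abstract does not say which perceptron variant or training setup the authors used. Purely as an illustration of the general idea, the following is a minimal sketch of an averaged perceptron that tunes feature weights over n-best lists. It assumes each hypothesis carries a feature vector and a sentence-level quality score; the function names, the data layout, and the choice of weight averaging are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed layout (hypothetical): (feature vector, sentence-level quality score)
Hypothesis = Tuple[Sequence[float], float]


def dot(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear model score of one hypothesis."""
    return sum(w * f for w, f in zip(weights, features))


def averaged_perceptron(
    nbest_lists: List[List[Hypothesis]],
    n_features: int,
    epochs: int = 5,
    learning_rate: float = 1.0,
) -> List[float]:
    """Learn feature weights by moving towards the best-scoring hypothesis.

    For every n-best list, the hypothesis ranked first by the current
    weights is compared with the oracle (highest quality score). If the
    model's choice is worse, the weights move towards the oracle's
    features and away from the chosen hypothesis's features. Returning
    the average of all intermediate weight vectors is a common way to
    reduce over-fitting to the tuning set.
    """
    weights = [0.0] * n_features
    weight_sums = [0.0] * n_features
    steps = 0
    for _ in range(epochs):
        for nbest in nbest_lists:
            predicted = max(nbest, key=lambda h: dot(weights, h[0]))
            oracle = max(nbest, key=lambda h: h[1])
            if predicted[1] < oracle[1]:
                for i in range(n_features):
                    weights[i] += learning_rate * (oracle[0][i] - predicted[0][i])
            for i in range(n_features):
                weight_sums[i] += weights[i]
            steps += 1
    if steps == 0:
        return weights
    return [s / steps for s in weight_sums]
```

In a real system the feature vectors would come from the decoder's n-best output and the quality score from a sentence-level approximation of BLEU; both are left abstract here.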

Original language: English
Publication status: Published - 1 Dec 2010
Event: 14th Annual Conference of the European Association for Machine Translation, EAMT 2010 - Saint-Raphael, France
Duration: 27 May 2010 to 28 May 2010

Other

Other: 14th Annual Conference of the European Association for Machine Translation, EAMT 2010
Country: France
City: Saint-Raphael
Period: 27/5/10 to 28/5/10

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Software

Cite this

España-Bonet, C., & Màrquez, L. (2010). Robust estimation of feature weights in Statistical Machine Translation. Paper presented at 14th Annual Conference of the European Association for Machine Translation, EAMT 2010, Saint-Raphael, France.