Parameter optimization for statistical machine translation: It pays to learn from hard examples

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Research on statistical machine translation has focused on particular translation directions, typically with English as the target language, e.g., from Arabic to English. When we reverse the translation direction, the multiple reference translations turn into multiple possible inputs, which offers both challenges and opportunities. We propose and evaluate several strategies for making use of these multiple inputs: (a) select one of the datasets, (b) select the best input for each sentence, and (c) synthesize an input for each sentence by fusing the available inputs. Surprisingly, we find that it is best to tune on the hardest available input, not on the one that yields the highest BLEU score. This finding has implications for how to pick good translators and how to select useful data for parameter optimization in SMT.

Original language: English
Title of host publication: International Conference Recent Advances in Natural Language Processing, RANLP
Pages: 504-510
Number of pages: 7
Publication status: Published - 2013
Event: 9th International Conference on Recent Advances in Natural Language Processing, RANLP 2013 - Hissar, Bulgaria
Duration: 9 Sep 2013 to 11 Sep 2013

Other

Other: 9th International Conference on Recent Advances in Natural Language Processing, RANLP 2013
Country: Bulgaria
City: Hissar
Period: 9/9/13 to 11/9/13

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Software
  • Electrical and Electronic Engineering

Cite this

Nakov, P., Khalid Al Obaidli, F., Guzmán, F., & Vogel, S. (2013). Parameter optimization for statistical machine translation: It pays to learn from hard examples. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 504-510).