Improving machine translation via triangulation and transliteration

Nadir Durrani, Philipp Koehn

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Citations (Scopus)

Abstract

In this paper, we improve Urdu→Hindi⇄English machine translation through triangulation and transliteration. First, we build an Urdu→Hindi SMT system by inducing triangulated and transliterated phrase tables from Urdu-English and Hindi-English phrase translation models. We then use this system to translate the Urdu side of the Urdu-English parallel data into Hindi, thus creating artificial Hindi-English parallel data. Our phrase-translation strategies give an improvement of up to +3.35 BLEU points over a baseline Urdu→Hindi system. The synthesized data improve the Hindi→English system by +0.35 BLEU points and the English→Hindi system by +1.0 BLEU points.
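The abstract does not spell out how the triangulated phrase table is induced. As a rough illustration only, the sketch below uses the common pivot formulation, in which source-to-target phrase probabilities are obtained by marginalizing over pivot (English) phrases: p(hindi | urdu) = Σ_english p(hindi | english) · p(english | urdu). This formulation is an assumption here, not something the record confirms for the paper. The function name `triangulate`, the romanized phrases, and all probabilities are hypothetical toy values.

```python
from collections import defaultdict


def triangulate(src_pivot, pivot_tgt):
    """Induce p(tgt | src) by summing over shared pivot phrases.

    src_pivot maps src phrase -> {pivot phrase: p(pivot | src)}
    pivot_tgt maps pivot phrase -> {tgt phrase: p(tgt | pivot)}
    Assumes the standard pivot marginalization; the paper's exact
    scoring may differ.
    """
    src_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for tgt, p_tgt_given_pivot in pivot_tgt.get(pivot, {}).items():
                src_tgt[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in src_tgt.items()}


# Hypothetical toy phrase tables (romanized, made-up probabilities).
urdu_english = {"kitab": {"book": 0.8, "volume": 0.2}}
english_hindi = {
    "book": {"kitaab": 0.9, "pustak": 0.1},
    "volume": {"khand": 1.0},
}

urdu_hindi = triangulate(urdu_english, english_hindi)
```

In this toy case every English pivot phrase appears in both tables, so the induced probabilities for "kitab" still sum to one. With real phrase tables, pivot phrases missing from one side lose probability mass, which is one reason a triangulated table is usually combined with other evidence such as transliteration.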

Original language: English
Title of host publication: Proceedings of the 17th Annual Conference of the European Association for Machine Translation, EAMT 2014
Publisher: European Association for Machine Translation
Pages: 71-78
Number of pages: 8
ISBN (Electronic): 9789535537533
Publication status: Published - 2014
Externally published: Yes
Event: 17th Annual Conference of the European Association for Machine Translation, EAMT 2014 - Dubrovnik, Croatia
Duration: 16 Jun 2014 to 18 Jun 2014

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Software

Cite this

Durrani, N., & Koehn, P. (2014). Improving machine translation via triangulation and transliteration. In Proceedings of the 17th Annual Conference of the European Association for Machine Translation, EAMT 2014 (pp. 71-78). European Association for Machine Translation.