An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora

Ying Zhang, Stephan Vogel

Research output: Contribution to conference (Paper)

Abstract

Most statistical machine translation (SMT) systems use phrase-to-phrase translations to capture local context information, leading to better lexical choices and more reliable word reordering. Long phrases capture more context than short phrases and yield better translation quality. On the other hand, the growing amount of bilingual data makes it seriously difficult to store all possible phrases. In this paper, we describe a novel phrase-to-phrase alignment model that allows for arbitrarily long phrases and works for very large bilingual corpora. The model is very efficient in both time and space, and the resulting translations are better than those of state-of-the-art systems.

Original language: English
Pages: 294-301
Number of pages: 8
Publication status: Published - 1 Dec 2005
Event: 10th Annual Conference on European Association for Machine Translation, EAMT 2005 - Budapest, Hungary
Duration: 30 May 2005 to 31 May 2005

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Software

Cite this

Zhang, Y., & Vogel, S. (2005). An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora. 294-301. Paper presented at 10th Annual Conference on European Association for Machine Translation, EAMT 2005, Budapest, Hungary.