An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora

Ying Zhang, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

31 Citations (Scopus)

Abstract

Most statistical machine translation (SMT) systems use phrase-to-phrase translations to capture local context information, leading to better lexical choices and more reliable word reordering. Long phrases capture more context than short phrases and result in better translation quality. On the other hand, the increasing amount of bilingual data poses serious problems for storing all possible phrases. In this paper, we describe a novel phrase-to-phrase alignment model which allows for arbitrarily long phrases and works for very large bilingual corpora. This model is very efficient in both time and space, and the resulting translations are better than those of state-of-the-art systems.

Original language: English
Title of host publication: European Association for Machine Translation, EAMT 2005 - 10th Annual Conference
Pages: 294-301
Number of pages: 8
Publication status: Published - 1 Dec 2005
Externally published: Yes
Event: 10th Annual Conference on European Association for Machine Translation, EAMT 2005 - Budapest, Hungary
Duration: 30 May 2005 - 31 May 2005

Other

Other: 10th Annual Conference on European Association for Machine Translation, EAMT 2005
Country: Hungary
City: Budapest
Period: 30/5/05 - 31/5/05

Fingerprint

Alignment
Machine Translation System
Statistical Machine Translation
Lexical Choices

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Software

Cite this

Zhang, Y., & Vogel, S. (2005). An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora. In European Association for Machine Translation, EAMT 2005 - 10th Annual Conference (pp. 294-301)

@inproceedings{b5e4b17e536d44bb9639590c7d991e37,
title = "An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora",
abstract = "Most statistical machine translation (SMT) systems use phrase-to-phrase translations to capture local context information, leading to better lexical choices and more reliable word reordering. Long phrases capture more contexts than short phrases and result in better translation qualities. On the other hand, the increasing amount of bilingual data poses serious problems for storing all possible phrases. In this paper, we describe a novel phrase-to-phrase alignment model which allows for arbitrarily long phrases and works for very large bilingual corpora. This model is very efficient in both time and space and the resulting translations are better than the state-of-the-art systems.",
author = "Ying Zhang and Stephan Vogel",
year = "2005",
month = "12",
day = "1",
language = "English",
pages = "294--301",
booktitle = "European Association for Machine Translation, EAMT 2005 - 10th Annual Conference",

}

TY - GEN

T1 - An efficient phrase-to-phrase alignment model for arbitrarily long phrase and large corpora

AU - Zhang, Ying

AU - Vogel, Stephan

PY - 2005/12/1

Y1 - 2005/12/1

N2 - Most statistical machine translation (SMT) systems use phrase-to-phrase translations to capture local context information, leading to better lexical choices and more reliable word reordering. Long phrases capture more contexts than short phrases and result in better translation qualities. On the other hand, the increasing amount of bilingual data poses serious problems for storing all possible phrases. In this paper, we describe a novel phrase-to-phrase alignment model which allows for arbitrarily long phrases and works for very large bilingual corpora. This model is very efficient in both time and space and the resulting translations are better than the state-of-the-art systems.

UR - http://www.scopus.com/inward/record.url?scp=84863136730&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863136730&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84863136730

SP - 294

EP - 301

BT - European Association for Machine Translation, EAMT 2005 - 10th Annual Conference

ER -