EMDC: A semi-supervised approach for word alignment

Qin Gao, Francisco Guzman, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner is used to find high-precision partial alignments that serve as constraints for a generative aligner, which implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements in AER. We also experimented with moderate-size Chinese machine translation tasks and obtained an average improvement of 0.5 BLEU points across five standard NIST test sets and four other test sets.
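The abstract does not say which generative model EMDC constrains or how the constraints enter the E-step. As an illustration only, the sketch below applies the general idea to IBM Model 1: wherever a partial alignment fixes a link, the E-step puts all posterior mass on that link instead of distributing it over the source words. The function name `constrained_model1`, the `constraints` format, and the toy data are assumptions made for this example, not the authors' implementation.

```python
from collections import defaultdict


def constrained_model1(pairs, constraints, iterations=10):
    """EM for IBM Model 1 with some alignment links held fixed.

    pairs:       list of (source_words, target_words) sentence pairs.
    constraints: dict mapping (pair_index, target_pos) -> source_pos,
                 a hypothetical encoding of high-precision partial links.
    Returns t, where t[(target_word, source_word)] estimates p(target | source).
    """
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))  # uniform initialisation

    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for n, (src, tgt) in enumerate(pairs):
            for j, f in enumerate(tgt):
                fixed = constraints.get((n, j))
                if fixed is not None:
                    # Constrained E-step: the fixed link takes all the mass.
                    posteriors = {fixed: 1.0}
                else:
                    # Standard Model 1 E-step: normalise over source words.
                    z = sum(t[(f, e)] for e in src)
                    posteriors = {i: t[(f, e)] / z for i, e in enumerate(src)}
                for i, p in posteriors.items():
                    count[(f, src[i])] += p
                    total[src[i]] += p
        # M-step: renormalise expected counts per source word.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


# Toy usage: the link haus -> house in sentence 0 is given as a constraint.
pairs = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
constraints = {(0, 1): 1}
t = constrained_model1(pairs, constraints)
```

In this toy setup the constrained word never contributes expected counts to any other source word, so a reliable partial link also sharpens the estimates for the unconstrained words around it.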

Original language: English
Title of host publication: Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference
Pages: 349-357
Number of pages: 9
Volume: 2
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 23rd International Conference on Computational Linguistics, Coling 2010 - Beijing, China
Duration: 23 Aug 2010 to 27 Aug 2010

Other

Other: 23rd International Conference on Computational Linguistics, Coling 2010
Country: China
City: Beijing
Period: 23/8/10 to 27/8/10

Fingerprint

Experiments
Generative
Alignment
Machine Translation

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Linguistics and Language

Cite this

Gao, Q., Guzman, F., & Vogel, S. (2010). EMDC: A semi-supervised approach for word alignment. In Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 2, pp. 349-357).

EMDC: A semi-supervised approach for word alignment. / Gao, Qin; Guzman, Francisco; Vogel, Stephan.

Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. Vol. 2 2010. p. 349-357.

Gao, Q, Guzman, F & Vogel, S 2010, EMDC: A semi-supervised approach for word alignment. in Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. vol. 2, pp. 349-357, 23rd International Conference on Computational Linguistics, Coling 2010, Beijing, China, 23/8/10.
Gao Q, Guzman F, Vogel S. EMDC: A semi-supervised approach for word alignment. In Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. Vol. 2. 2010. p. 349-357.
Gao, Qin ; Guzman, Francisco ; Vogel, Stephan. / EMDC: A semi-supervised approach for word alignment. Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. Vol. 2 2010. pp. 349-357.
@inproceedings{9e39331606b14596ac3f9c3e29577e72,
title = "EMDC: A semi-supervised approach for word alignment",
abstract = "This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner is used to find high-precision partial alignments that serve as constraints for a generative aligner, which implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements in AER. We also experimented with moderate-size Chinese machine translation tasks and obtained an average improvement of 0.5 BLEU points across five standard NIST test sets and four other test sets.",
author = "Qin Gao and Francisco Guzman and Stephan Vogel",
year = "2010",
month = "12",
day = "1",
language = "English",
volume = "2",
pages = "349--357",
booktitle = "Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference",
}

TY - GEN

T1 - EMDC

T2 - A semi-supervised approach for word alignment

AU - Gao, Qin

AU - Guzman, Francisco

AU - Vogel, Stephan

PY - 2010/12/1

Y1 - 2010/12/1

N2 - This paper proposes a novel semi-supervised word alignment technique called EMDC that integrates discriminative and generative methods. A discriminative aligner is used to find high-precision partial alignments that serve as constraints for a generative aligner, which implements a constrained version of the EM algorithm. Experiments on small-size Chinese and Arabic tasks show consistent improvements in AER. We also experimented with moderate-size Chinese machine translation tasks and obtained an average improvement of 0.5 BLEU points across five standard NIST test sets and four other test sets.

UR - http://www.scopus.com/inward/record.url?scp=80053417783&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80053417783&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:80053417783

VL - 2

SP - 349

EP - 357

BT - Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference

ER -