EMDC: A semi-supervised approach for word alignment

Qin Gao, Francisco Guzman, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper proposes EMDC, a novel semi-supervised word alignment technique that integrates discriminative and generative methods. A discriminative aligner finds high-precision partial alignments, which serve as constraints for a generative aligner that implements a constrained version of the EM algorithm. Experiments on small Chinese and Arabic tasks show consistent improvements in AER. We also experimented with moderate-size Chinese machine translation tasks and obtained an average improvement of 0.5 BLEU points across five standard NIST test sets and four other test sets.
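
To make the idea of a constrained EM step concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not say which generative model is used, so IBM Model 1 (without a NULL word) stands in as an assumption. The function name `constrained_model1` and the `constraints` input format are hypothetical, chosen only for illustration. For each target word with a fixed link, the E-step places all posterior mass on that link; every other word receives the usual Model 1 posterior.

```python
from collections import defaultdict


def constrained_model1(corpus, constraints, iterations=10):
    """Estimate IBM Model 1 probabilities t(f | e) under fixed partial links.

    corpus: list of (source_tokens, target_tokens) sentence pairs.
    constraints: one dict per sentence pair, mapping a target position j to
        the source position i it must align to (hypothetical format).
    Simplification: no NULL source word is modelled.
    """
    # Uniform initialisation over the target vocabulary.
    tgt_vocab = {f for _, tgt in corpus for f in tgt}
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of each (f, e) link
        total = defaultdict(float)  # expected count of each source word e
        for (src, tgt), fixed in zip(corpus, constraints):
            for j, f in enumerate(tgt):
                if j in fixed:
                    # Constrained E-step: all mass goes to the given link.
                    posteriors = {fixed[j]: 1.0}
                else:
                    # Standard E-step: normalise t(f | e) over source positions.
                    norm = sum(t[(f, e)] for e in src)
                    posteriors = {i: t[(f, e)] / norm
                                  for i, e in enumerate(src)}
                for i, p in posteriors.items():
                    count[(f, src[i])] += p
                    total[src[i]] += p
        # M-step: renormalise the expected counts per source word.
        t = defaultdict(float, {(f, e): c / total[e]
                                for (f, e), c in count.items()})
    return t
```

A call such as `constrained_model1(corpus, [{1: 1}, {}, {}])` would fix target position 1 of the first sentence pair to source position 1 and leave the other pairs unconstrained. In the setting the abstract describes, such links would come from the discriminative aligner rather than being supplied by hand.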

Original language: English
Title of host publication: Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference
Pages: 349-357
Number of pages: 9
Volume: 2
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 23rd International Conference on Computational Linguistics, Coling 2010 - Beijing, China
Duration: 23 Aug 2010 to 27 Aug 2010

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Linguistics and Language

Cite this

Gao, Q., Guzman, F., & Vogel, S. (2010). EMDC: A semi-supervised approach for word alignment. In Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 2, pp. 349-357).