Full-text story alignment models for Chinese-English bilingual news corpora

Bing Zhao, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

In this paper, we describe full-text story alignment on Chinese-English bilingual news corpora, used to mine potential parallel data for machine translation. Several standard information retrieval methods are tested, and two translation-model-based alignment models are proposed and studied. Modeling the process of generating the parallel English story from the Chinese story gives significant improvements over the standard information retrieval techniques. Refinements of the alignment model are also proposed and tested in detail. On one day's bilingual news collection, our methods improved the mean reciprocal rank from 0.31 to 0.68.
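The abstract reports results as mean reciprocal rank (MRR). As a minimal sketch of how that metric is usually computed, assuming each Chinese story has exactly one correct English counterpart and that a story whose counterpart is not retrieved contributes zero: the function name `mean_reciprocal_rank` and the example ranks are hypothetical illustrations, not taken from the paper.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all query stories.

    ranks[i] is the 1-based position of the correct English story in the
    ranked candidate list for Chinese story i, or None if it was not
    retrieved at all (assumed here to contribute 0).
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    total = 0.0
    for rank in ranks:
        if rank is not None:
            if rank < 1:
                raise ValueError("ranks are 1-based")
            total += 1.0 / rank
    return total / len(ranks)


# Hypothetical ranks for four Chinese stories (illustrative only).
example_ranks = [1, 2, None, 4]
example_mrr = mean_reciprocal_rank(example_ranks)
```

Under this definition, an MRR of 1.0 would mean the correct story is always ranked first, so a higher value indicates that correct alignments appear nearer the top of the candidate list.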

Original language: English
Title of host publication: 7th International Conference on Spoken Language Processing, ICSLP 2002
Publisher: International Speech Communication Association
Pages: 517-520
Number of pages: 4
Publication status: Published - 2002
Externally published: Yes
Event: 7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States
Duration: 16 Sep 2002 to 20 Sep 2002

Other

Other: 7th International Conference on Spoken Language Processing, ICSLP 2002
Country: United States
City: Denver
Period: 16/9/02 to 20/9/02

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Zhao, B., & Vogel, S. (2002). Full-text story alignment models for Chinese-English bilingual news corpora. In 7th International Conference on Spoken Language Processing, ICSLP 2002 (pp. 517-520). International Speech Communication Association.