Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario

Alon Lavie, Stephan Vogel, Lori Levin, Erik Peterson, Katharina Probst, Ariadna Font Llitjós, Rachel Reynolds, Jaime Carbonell, Richard Cohen

Research output: Contribution to journal (Article)

15 Citations (Scopus)

Abstract

We describe an experiment designed to evaluate the capabilities of our trainable transfer-based (XFER) machine translation approach, as applied to the task of Hindi-to-English translation, and trained under an extremely limited data scenario. We compare the performance of the XFER approach with two corpus-based approaches, Statistical MT (SMT) and Example-based MT (EBMT), under the same limited data scenario. The results indicate that the XFER system significantly outperforms both EBMT and SMT in this scenario. Results also indicate that automatically learned transfer rules are effective in improving translation performance, compared with a baseline word-to-word translation version of the system. XFER system performance with a limited number of manually written transfer rules is, however, still better than with the current automatically inferred rules. Furthermore, a “multiengine” version of our system that combined the output of the XFER and SMT systems and optimized translation selection outperformed both individual systems.

Original language: English
Pages (from-to): 143-163
Number of pages: 21
Journal: ACM Transactions on Asian Language Information Processing
Volume: 2
Issue number: 2
DOIs: https://doi.org/10.1145/974740.974747
Publication status: Published - 1 Jun 2003
Externally published: Yes

Keywords

  • Design
  • Evaluation
  • example-based machine translation
  • Experimentation
  • Hindi
  • Languages
  • limited data resources
  • machine learning
  • Measurement
  • multiengine machine translation
  • statistical translation
  • transfer rules

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Lavie, A., Vogel, S., Levin, L., Peterson, E., Probst, K., Llitjós, A. F., Reynolds, R., Carbonell, J., & Cohen, R. (2003). Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario. ACM Transactions on Asian Language Information Processing, 2(2), 143-163. https://doi.org/10.1145/974740.974747