Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario

Alon Lavie, Stephan Vogel, Lori Levin, Erik Peterson, Katharina Probst, Ariadna Font Llitjós, Rachel Reynolds, Jaime Carbonell, Richard Cohen

Research output: Contribution to journal, Article

14 Citations (Scopus)

Abstract

We describe an experiment designed to evaluate the capabilities of our trainable transfer-based (XFER) machine translation approach, as applied to the task of Hindi-to-English translation and trained under an extremely limited data scenario. We compare the performance of the XFER approach with two corpus-based approaches, Statistical MT (SMT) and Example-based MT (EBMT), under the limited data scenario. The results indicate that the XFER system significantly outperforms both EBMT and SMT in this scenario. Results also indicate that automatically learned transfer rules are effective in improving translation performance, compared with a baseline word-to-word translation version of the system. XFER system performance with a limited number of manually written transfer rules is, however, still better than with the current automatically inferred rules. Furthermore, a “multiengine” version of our system that combined the output of the XFER and SMT systems and optimized translation selection outperformed both individual systems.

Original language: English
Pages (from-to): 143-163
Number of pages: 21
Journal: ACM Transactions on Asian Language Information Processing
Volume: 2
Issue number: 2
DOIs: https://doi.org/10.1145/974740.974747
Publication status: Published - 1 Jun 2003
Externally published: Yes

Keywords

  • Design
  • Evaluation
  • example-based machine translation
  • Experimentation
  • Hindi
  • Languages
  • limited data resources
  • machine learning
  • Measurement
  • multiengine machine translation
  • statistical translation
  • transfer rules

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario. / Lavie, Alon; Vogel, Stephan; Levin, Lori; Peterson, Erik; Probst, Katharina; Llitjós, Ariadna Font; Reynolds, Rachel; Carbonell, Jaime; Cohen, Richard.

In: ACM Transactions on Asian Language Information Processing, Vol. 2, No. 2, 01.06.2003, p. 143-163.

Lavie, A, Vogel, S, Levin, L, Peterson, E, Probst, K, Llitjós, AF, Reynolds, R, Carbonell, J & Cohen, R 2003, 'Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario', ACM Transactions on Asian Language Information Processing, vol. 2, no. 2, pp. 143-163. https://doi.org/10.1145/974740.974747

Lavie, Alon ; Vogel, Stephan ; Levin, Lori ; Peterson, Erik ; Probst, Katharina ; Llitjós, Ariadna Font ; Reynolds, Rachel ; Carbonell, Jaime ; Cohen, Richard. / Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario. In: ACM Transactions on Asian Language Information Processing. 2003 ; Vol. 2, No. 2. pp. 143-163.

@article{5c65f8768e6b45b086a7327a3c89d73f,
title = "Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario",
abstract = "We describe an experiment designed to evaluate the capabilities of our trainable transfer-based (XFER) machine translation approach, as applied to the task of Hindi-to-English translation and trained under an extremely limited data scenario. We compare the performance of the XFER approach with two corpus-based approaches, Statistical MT (SMT) and Example-based MT (EBMT), under the limited data scenario. The results indicate that the XFER system significantly outperforms both EBMT and SMT in this scenario. Results also indicate that automatically learned transfer rules are effective in improving translation performance, compared with a baseline word-to-word translation version of the system. XFER system performance with a limited number of manually written transfer rules is, however, still better than with the current automatically inferred rules. Furthermore, a “multiengine” version of our system that combined the output of the XFER and SMT systems and optimized translation selection outperformed both individual systems.",
keywords = "Design, Evaluation, example-based machine translation, Experimentation, Hindi, Languages, limited data resources, machine learning, Measurement, multiengine machine translation, statistical translation, transfer rules",
author = "Alon Lavie and Stephan Vogel and Lori Levin and Erik Peterson and Katharina Probst and Llitj{\'o}s, {Ariadna Font} and Rachel Reynolds and Jaime Carbonell and Richard Cohen",
year = "2003",
month = "6",
day = "1",
doi = "10.1145/974740.974747",
language = "English",
volume = "2",
pages = "143--163",
journal = "ACM Transactions on Asian Language Information Processing",
issn = "1530-0226",
publisher = "Association for Computing Machinery (ACM)",
number = "2",
}

TY - JOUR
T1 - Experiments with a Hindi-to-English Transfer-Based MT System Under a Miserly Data Scenario
AU - Lavie, Alon
AU - Vogel, Stephan
AU - Levin, Lori
AU - Peterson, Erik
AU - Probst, Katharina
AU - Llitjós, Ariadna Font
AU - Reynolds, Rachel
AU - Carbonell, Jaime
AU - Cohen, Richard
PY - 2003/6/1
Y1 - 2003/6/1
AB - We describe an experiment designed to evaluate the capabilities of our trainable transfer-based (XFER) machine translation approach, as applied to the task of Hindi-to-English translation and trained under an extremely limited data scenario. We compare the performance of the XFER approach with two corpus-based approaches, Statistical MT (SMT) and Example-based MT (EBMT), under the limited data scenario. The results indicate that the XFER system significantly outperforms both EBMT and SMT in this scenario. Results also indicate that automatically learned transfer rules are effective in improving translation performance, compared with a baseline word-to-word translation version of the system. XFER system performance with a limited number of manually written transfer rules is, however, still better than with the current automatically inferred rules. Furthermore, a “multiengine” version of our system that combined the output of the XFER and SMT systems and optimized translation selection outperformed both individual systems.
KW - Design
KW - Evaluation
KW - example-based machine translation
KW - Experimentation
KW - Hindi
KW - Languages
KW - limited data resources
KW - machine learning
KW - Measurement
KW - multiengine machine translation
KW - statistical translation
KW - transfer rules
UR - http://www.scopus.com/inward/record.url?scp=35048819551&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35048819551&partnerID=8YFLogxK
U2 - 10.1145/974740.974747
DO - 10.1145/974740.974747
M3 - Article
AN - SCOPUS:35048819551
VL - 2
SP - 143
EP - 163
JO - ACM Transactions on Asian Language Information Processing
JF - ACM Transactions on Asian Language Information Processing
SN - 1530-0226
IS - 2
ER -