Hindi-to-Urdu machine translation through transliteration

Nadir Durrani, Hassan Sajjad, Alexander Fraser, Helmut Schmid

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

22 Citations (Scopus)

Abstract

We present a novel approach to integrating transliteration into Hindi-to-Urdu statistical machine translation. We propose two probabilistic models, one based on a conditional probability formulation and one on a joint probability formulation. Both models consider transliteration as well as translation when translating a Hindi word in context, whereas previous work uses transliteration only for translating out-of-vocabulary (OOV) words. We use transliteration as a tool for disambiguating Hindi homonyms, which, depending on context, can be translated, transliterated, or transliterated in different ways. We obtain final BLEU scores of 19.35 (conditional probability model) and 19.00 (joint probability model), compared with 14.30 for a baseline phrase-based system and 16.25 for a system that transliterates OOV words in the baseline system. This indicates that, for language pairs like Hindi-Urdu, transliteration is useful for more than just translating OOV words.
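To make the size of the reported improvements easier to read, the short sketch below computes absolute and relative BLEU gains from the four scores given in the abstract. Only the scores come from the source; the dictionary keys and the helper names `gain` and `relative_gain` are illustrative assumptions, and the sketch says nothing about how the models themselves work.

```python
# BLEU scores as reported in the abstract. The key names are illustrative
# labels chosen for this sketch, not identifiers from the paper.
BLEU = {
    "baseline_phrase_based": 14.30,
    "baseline_plus_oov_transliteration": 16.25,
    "joint_probability_model": 19.00,
    "conditional_probability_model": 19.35,
}


def gain(system: str, reference: str) -> float:
    """Absolute BLEU difference between two systems, rounded to 2 decimals."""
    return round(BLEU[system] - BLEU[reference], 2)


def relative_gain(system: str, reference: str) -> float:
    """BLEU difference as a percentage of the reference score, 1 decimal."""
    diff = BLEU[system] - BLEU[reference]
    return round(100.0 * diff / BLEU[reference], 1)


# Compare each proposed model against both baselines from the abstract.
for model in ("conditional_probability_model", "joint_probability_model"):
    for ref in ("baseline_phrase_based", "baseline_plus_oov_transliteration"):
        print(f"{model} vs {ref}: "
              f"+{gain(model, ref):.2f} BLEU ({relative_gain(model, ref):.1f}%)")
```

Under these numbers, the conditional model is 5.05 BLEU points above the phrase-based baseline and 3.10 points above the system that only transliterates OOV words, which is the comparison behind the abstract's closing claim.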

Original language: English
Title of host publication: ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Pages: 465-474
Number of pages: 10
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010 - Uppsala, Sweden
Duration: 11 Jul 2010 to 16 Jul 2010

Other

Other: 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010
Country: Sweden
City: Uppsala
Period: 11/7/10 to 16/7/10

Fingerprint

vocabulary
Urdu
Machine Translation
Transliteration
language
Vocabulary
Translating

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Durrani, N., Sajjad, H., Fraser, A., & Schmid, H. (2010). Hindi-to-Urdu machine translation through transliteration. In ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 465-474)

Hindi-to-Urdu machine translation through transliteration. / Durrani, Nadir; Sajjad, Hassan; Fraser, Alexander; Schmid, Helmut.

ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. 2010. p. 465-474.

Durrani, N, Sajjad, H, Fraser, A & Schmid, H 2010, Hindi-to-Urdu machine translation through transliteration. in ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. pp. 465-474, 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010, Uppsala, Sweden, 11/7/10.
Durrani N, Sajjad H, Fraser A, Schmid H. Hindi-to-Urdu machine translation through transliteration. In ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. 2010. p. 465-474
Durrani, Nadir ; Sajjad, Hassan ; Fraser, Alexander ; Schmid, Helmut. / Hindi-to-Urdu machine translation through transliteration. ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference. 2010. pp. 465-474
@inproceedings{038d127f508947f3baf0c38a5049025d,
title = "Hindi-to-Urdu machine translation through transliteration",
abstract = "We present a novel approach to integrate transliteration into Hindi-to-Urdu statistical machine translation. We propose two probabilistic models, based on conditional and joint probability formulations, that are novel solutions to the problem. Our models consider both transliteration and translation when translating a particular Hindi word given the context whereas in previous work transliteration is only used for translating OOV (out-of-vocabulary) words. We use transliteration as a tool for disambiguation of Hindi homonyms which can be both translated or transliterated or transliterated differently based on different contexts. We obtain final BLEU scores of 19.35 (conditional probability model) and 19.00 (joint probability model) as compared to 14.30 for a baseline phrase-based system and 16.25 for a system which transliterates OOV words in the baseline system. This indicates that transliteration is useful for more than only translating OOV words for language pairs like Hindi-Urdu.",
author = "Nadir Durrani and Hassan Sajjad and Alexander Fraser and Helmut Schmid",
year = "2010",
month = "12",
day = "1",
language = "English",
isbn = "9781617388088",
pages = "465--474",
booktitle = "ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference",

}

TY - GEN

T1 - Hindi-to-Urdu machine translation through transliteration

AU - Durrani, Nadir

AU - Sajjad, Hassan

AU - Fraser, Alexander

AU - Schmid, Helmut

PY - 2010/12/1

Y1 - 2010/12/1

AB - We present a novel approach to integrate transliteration into Hindi-to-Urdu statistical machine translation. We propose two probabilistic models, based on conditional and joint probability formulations, that are novel solutions to the problem. Our models consider both transliteration and translation when translating a particular Hindi word given the context whereas in previous work transliteration is only used for translating OOV (out-of-vocabulary) words. We use transliteration as a tool for disambiguation of Hindi homonyms which can be both translated or transliterated or transliterated differently based on different contexts. We obtain final BLEU scores of 19.35 (conditional probability model) and 19.00 (joint probability model) as compared to 14.30 for a baseline phrase-based system and 16.25 for a system which transliterates OOV words in the baseline system. This indicates that transliteration is useful for more than only translating OOV words for language pairs like Hindi-Urdu.

UR - http://www.scopus.com/inward/record.url?scp=84859975645&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84859975645&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781617388088

SP - 465

EP - 474

BT - ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

ER -