A statistical model for unsupervised and semi-supervised transliteration mining

Hassan Sajjad, Alexander Fraser, Helmut Schmid

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

12 Citations (Scopus)

Abstract

We propose a novel model to automatically extract transliteration pairs from parallel corpora. Our model is efficient and language-pair independent, and it mines transliteration pairs in a consistent fashion in both unsupervised and semi-supervised settings. We model transliteration mining as an interpolation of transliteration and non-transliteration sub-models. We evaluate on NEWS 2010 shared task data and on parallel corpora, with competitive results.
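The abstract does not state the form of the interpolation. As a minimal sketch, assuming a linear mixture of the two sub-models with a single weight, the idea could be illustrated as follows. The names `interpolate`, `posterior_non_transliteration`, `p_translit`, `p_non_translit`, and `lam` are hypothetical and are not taken from the paper.

```python
def interpolate(p_translit: float, p_non_translit: float, lam: float) -> float:
    """Linear interpolation of two sub-model probabilities for one word pair.

    Assumption: p = (1 - lam) * p_translit + lam * p_non_translit,
    where lam is the weight given to the non-transliteration sub-model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * p_translit + lam * p_non_translit


def posterior_non_transliteration(
    p_translit: float, p_non_translit: float, lam: float
) -> float:
    """Share of the mixture probability contributed by the non-transliteration part.

    Under the assumed mixture, a low value suggests that the word pair is
    better explained by the transliteration sub-model.
    """
    mixture = interpolate(p_translit, p_non_translit, lam)
    if mixture == 0.0:
        # Neither sub-model assigns any probability; fall back to the weight itself.
        return lam
    return lam * p_non_translit / mixture
```

Under these assumptions, a word pair could be accepted as a transliteration when `posterior_non_transliteration` falls below some threshold such as 0.5; that threshold is also an illustrative choice, not a value from the source.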

Original language: English
Title of host publication: 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference
Pages: 469-477
Number of pages: 9
Volume: 1
Publication status: Published - 1 Dec 2012
Externally published: Yes
Event: 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Jeju Island, Korea, Republic of
Duration: 8 Jul 2012 - 14 Jul 2012

Other

Other: 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012
Country: Korea, Republic of
City: Jeju Island
Period: 8/7/12 - 14/7/12

Fingerprint

Interpolation
Statistical Models

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software

Cite this

Sajjad, H., Fraser, A., & Schmid, H. (2012). A statistical model for unsupervised and semi-supervised transliteration mining. In 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference (Vol. 1, pp. 469-477)

@inproceedings{3dfbf485ef184d06ab6de52e4a2e55ef,
title = "A statistical model for unsupervised and semi-supervised transliteration mining",
abstract = "We propose a novel model to automatically extract transliteration pairs from parallel corpora. Our model is efficient, language pair independent and mines transliteration pairs in a consistent fashion in both unsupervised and semi-supervised settings. We model transliteration mining as an interpolation of transliteration and non-transliteration sub-models. We evaluate on NEWS 2010 shared task data and on parallel corpora with competitive results.",
author = "Hassan Sajjad and Alexander Fraser and Helmut Schmid",
year = "2012",
month = "12",
day = "1",
language = "English",
isbn = "9781937284244",
volume = "1",
pages = "469--477",
booktitle = "50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference",
}

TY  - GEN
T1  - A statistical model for unsupervised and semi-supervised transliteration mining
AU  - Sajjad, Hassan
AU  - Fraser, Alexander
AU  - Schmid, Helmut
PY  - 2012/12/1
Y1  - 2012/12/1
N2  - We propose a novel model to automatically extract transliteration pairs from parallel corpora. Our model is efficient, language pair independent and mines transliteration pairs in a consistent fashion in both unsupervised and semi-supervised settings. We model transliteration mining as an interpolation of transliteration and non-transliteration sub-models. We evaluate on NEWS 2010 shared task data and on parallel corpora with competitive results.
AB  - We propose a novel model to automatically extract transliteration pairs from parallel corpora. Our model is efficient, language pair independent and mines transliteration pairs in a consistent fashion in both unsupervised and semi-supervised settings. We model transliteration mining as an interpolation of transliteration and non-transliteration sub-models. We evaluate on NEWS 2010 shared task data and on parallel corpora with competitive results.
UR  - http://www.scopus.com/inward/record.url?scp=84878188380&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84878188380&partnerID=8YFLogxK
M3  - Conference contribution
SN  - 9781937284244
VL  - 1
SP  - 469
EP  - 477
BT  - 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference
ER  -