Arabic to English person name transliteration using Twitter

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Social media outlets are providing new opportunities for harvesting valuable resources. We present a novel approach for mining data from Twitter for the purpose of building transliteration resources and systems. Such resources are crucial in translation and retrieval tasks. We demonstrate the benefits of the approach on Arabic to English transliteration. The contributions of this approach include the size of the data that can be collected and exploited within a limited time span; its generality, which allows it to be adapted to other languages; and its ability to cope with new transliteration phenomena and trends. A statistical transliteration system built using this data improved on a comparable system built from Wikipedia wikilinks data.

Original language: English
Title of host publication: Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
Publisher: European Language Resources Association (ELRA)
Pages: 351-355
Number of pages: 5
ISBN (Electronic): 9782951740891
Publication status: Published - 1 Jan 2016
Event: 10th International Conference on Language Resources and Evaluation, LREC 2016 - Portoroz, Slovenia
Duration: 23 May 2016 to 28 May 2016

Other

Other: 10th International Conference on Language Resources and Evaluation, LREC 2016
Country: Slovenia
City: Portoroz
Period: 23/5/16 to 28/5/16

Keywords

  • Arabic language variations
  • Named entities
  • Social media
  • Transliteration
  • Tweet normalization

ASJC Scopus subject areas

  • Linguistics and Language
  • Library and Information Sciences
  • Language and Linguistics
  • Education

Cite this

Mubarak, H., & Abdelali, A. (2016). Arabic to English person name transliteration using Twitter. In Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016 (pp. 351-355). European Language Resources Association (ELRA).

@inproceedings{dea0a08a48f747cc9142ee2c11ea5674,
title = "Arabic to English person name transliteration using Twitter",
abstract = "Social media outlets are providing new opportunities for harvesting valuable resources. We present a novel approach for mining data from Twitter for the purpose of building transliteration resources and systems. Such resources are crucial in translation and retrieval tasks. We demonstrate the benefits of the approach on Arabic to English transliteration. The contributions of this approach include the size of the data that can be collected and exploited within a limited time span; its generality, which allows it to be adapted to other languages; and its ability to cope with new transliteration phenomena and trends. A statistical transliteration system built using this data improved on a comparable system built from Wikipedia wikilinks data.",
keywords = "Arabic language variations, Named entities, Social media, Transliteration, Tweet normalization",
author = "Hamdy Mubarak and Ahmed Abdelali",
year = "2016",
month = "1",
day = "1",
language = "English",
pages = "351--355",
booktitle = "Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016",
publisher = "European Language Resources Association (ELRA)",
}

TY  - GEN
T1  - Arabic to English person name transliteration using Twitter
AU  - Mubarak, Hamdy
AU  - Abdelali, Ahmed
PY  - 2016/1/1
Y1  - 2016/1/1
N2  - Social media outlets are providing new opportunities for harvesting valuable resources. We present a novel approach for mining data from Twitter for the purpose of building transliteration resources and systems. Such resources are crucial in translation and retrieval tasks. We demonstrate the benefits of the approach on Arabic to English transliteration. The contributions of this approach include the size of the data that can be collected and exploited within a limited time span; its generality, which allows it to be adapted to other languages; and its ability to cope with new transliteration phenomena and trends. A statistical transliteration system built using this data improved on a comparable system built from Wikipedia wikilinks data.
KW  - Arabic language variations
KW  - Named entities
KW  - Social media
KW  - Transliteration
KW  - Tweet normalization
UR  - http://www.scopus.com/inward/record.url?scp=85034855157&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85034855157&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:85034855157
SP  - 351
EP  - 355
BT  - Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
PB  - European Language Resources Association (ELRA)
ER  -