ICE-TEA

In-context expansion and translation of English abbreviations

Waleed Ammar, Kareem Darwish, Ali El Kahki, Khaled Hafez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

The wide use of abbreviations in modern texts poses interesting challenges and opportunities in the field of NLP. In addition to their dynamic nature, abbreviations are highly polysemous with respect to regular words. Technologies that exhibit some level of language understanding may be adversely impacted by the presence of abbreviations. This paper addresses two related problems: (1) expansion of abbreviations given a context, and (2) translation of sentences with abbreviations. First, an efficient retrieval-based method for English abbreviation expansion is presented. Then, a hybrid system is used to pick among simple abbreviation-translation methods. The hybrid system achieves an improvement of 1.48 BLEU points over the baseline MT system, using sentences that contain abbreviations as a test set.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 41-54
Number of pages: 14
Volume: 6609 LNCS
Edition: PART 2
DOI: https://doi.org/10.1007/978-3-642-19437-5_4
Publication status: Published - 9 Mar 2011
Externally published: Yes
Event: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011 - Tokyo, Japan
Duration: 20 Feb 2011 to 26 Feb 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 2
Volume: 6609 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2011
Country: Japan
City: Tokyo
Period: 20/2/11 to 26/2/11

Fingerprint

Abbreviation
Hybrid systems
Context
Test Set
Baseline
Retrieval

Keywords

  • abbreviations
  • statistical machine translation
  • word sense disambiguation

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Ammar, W., Darwish, K., El Kahki, A., & Hafez, K. (2011). ICE-TEA: In-context expansion and translation of English abbreviations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (PART 2 ed., Vol. 6609 LNCS, pp. 41-54). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 6609 LNCS, No. PART 2). https://doi.org/10.1007/978-3-642-19437-5_4

@inproceedings{39fb2ba32dfb49b1a45e222dc3543b33,
title = "ICE-TEA: In-context expansion and translation of English abbreviations",
abstract = "The wide use of abbreviations in modern texts poses interesting challenges and opportunities in the field of NLP. In addition to their dynamic nature, abbreviations are highly polysemous with respect to regular words. Technologies that exhibit some level of language understanding may be adversely impacted by the presence of abbreviations. This paper addresses two related problems: (1) expansion of abbreviations given a context, and (2) translation of sentences with abbreviations. First, an efficient retrieval-based method for English abbreviation expansion is presented. Then, a hybrid system is used to pick among simple abbreviation-translation methods. The hybrid system achieves an improvement of 1.48 BLEU points over the baseline MT system, using sentences that contain abbreviations as a test set.",
keywords = "abbreviations, statistical machine translation, word sense disambiguation",
author = "Waleed Ammar and Kareem Darwish and {El Kahki}, Ali and Khaled Hafez",
year = "2011",
month = "3",
day = "9",
doi = "10.1007/978-3-642-19437-5_4",
language = "English",
isbn = "9783642194368",
volume = "6609 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 2",
pages = "41--54",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
edition = "PART 2",
}

TY  - GEN
T1  - ICE-TEA
T2  - In-context expansion and translation of English abbreviations
AU  - Ammar, Waleed
AU  - Darwish, Kareem
AU  - El Kahki, Ali
AU  - Hafez, Khaled
PY  - 2011/3/9
Y1  - 2011/3/9
AB  - The wide use of abbreviations in modern texts poses interesting challenges and opportunities in the field of NLP. In addition to their dynamic nature, abbreviations are highly polysemous with respect to regular words. Technologies that exhibit some level of language understanding may be adversely impacted by the presence of abbreviations. This paper addresses two related problems: (1) expansion of abbreviations given a context, and (2) translation of sentences with abbreviations. First, an efficient retrieval-based method for English abbreviation expansion is presented. Then, a hybrid system is used to pick among simple abbreviation-translation methods. The hybrid system achieves an improvement of 1.48 BLEU points over the baseline MT system, using sentences that contain abbreviations as a test set.
KW  - abbreviations
KW  - statistical machine translation
KW  - word sense disambiguation
UR  - http://www.scopus.com/inward/record.url?scp=79952266261&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=79952266261&partnerID=8YFLogxK
U2  - 10.1007/978-3-642-19437-5_4
DO  - 10.1007/978-3-642-19437-5_4
M3  - Conference contribution
SN  - 9783642194368
VL  - 6609 LNCS
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 41
EP  - 54
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  -