Improved named entity translation and bilingual named entity extraction

Huang Fei, Stephan Vogel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

26 Citations (Scopus)

Abstract

Translation of named entities (NE), including proper names and temporal and numerical expressions, is important in multilingual natural language processing tasks such as cross-lingual information retrieval and statistical machine translation. We present an integrated approach that extracts a named entity translation dictionary from a bilingual corpus while at the same time improving the named entity annotation quality. Starting from a bilingual corpus in which the named entities are extracted independently for each language, a statistical alignment model is used to align the named entities. An iterative process is then applied to extract the named entity pairs with higher alignment probability. This leads to a smaller but cleaner named entity translation dictionary and to a significant improvement in the monolingual named entity annotation quality for both languages. Experimental results show that the dictionary size is reduced by 51.8% and that the annotation quality, in terms of F-score, improves from 70.03 to 78.15 for Chinese and from 73.38 to 81.46 for English.
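
The abstract does not give the details of the alignment model or of the iteration, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple co-occurrence lexicon re-estimated from the currently kept pairs, a length-normalised Model-1-style score, and a fixed threshold; the function names, the threshold value, and the toy data are all hypothetical.

```python
from collections import defaultdict


def reestimate(pairs):
    """Relative-frequency lexicon p(t | s) from token co-occurrences in the kept pairs."""
    counts = defaultdict(float)
    totals = defaultdict(float)
    for src, tgt in pairs:
        for s in src:
            for t in tgt:
                counts[(s, t)] += 1.0
                totals[s] += 1.0
    return {(s, t): c / totals[s] for (s, t), c in counts.items()}


def pair_score(src, tgt, lexicon, floor=1e-6):
    """Model-1-style score: average p(t | s) over source tokens for each
    target token, multiplied together and normalised by target length."""
    score = 1.0
    for t in tgt:
        score *= sum(lexicon.get((s, t), floor) for s in src) / len(src)
    return score ** (1.0 / len(tgt))


def iterative_filter(candidates, threshold=0.4, max_iters=5):
    """Repeatedly re-estimate the lexicon and drop low-scoring pairs
    until no pair is removed (or max_iters is reached)."""
    kept = list(candidates)
    for _ in range(max_iters):
        lexicon = reestimate(kept)
        new_kept = [p for p in kept if pair_score(p[0], p[1], lexicon) >= threshold]
        if len(new_kept) == len(kept):
            break
        kept = new_kept
    return kept


# Toy candidate NE pairs (romanised source tokens, English target tokens).
# The ("beijing", "monday") pair plays the role of an annotation/alignment error.
candidates = [
    (("beijing",), ("beijing",)),
    (("beijing",), ("beijing",)),
    (("beijing",), ("monday",)),
    (("shanghai",), ("shanghai",)),
    (("li", "peng"), ("li", "peng")),
]

kept = iterative_filter(candidates, threshold=0.4)
for src, tgt in kept:
    print(" ".join(src), "->", " ".join(tgt))
```

On this toy input the noisy pair scores 1/3 in the first pass and is dropped, after which the remaining pairs are stable. The sketch covers only the dictionary-cleaning half of the approach; how the paper feeds the alignment back into the monolingual annotation is not described in the abstract and is not modelled here.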

Original language: English
Title of host publication: Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 253-258
Number of pages: 6
ISBN (Print): 0769518346, 9780769518343
DOI: https://doi.org/10.1109/ICMI.2002.1167002
Publication status: Published - 2002
Externally published: Yes
Event: 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002 - Pittsburgh, United States
Duration: 14 Oct 2002 to 16 Oct 2002

Other

Other: 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002
Country: United States
City: Pittsburgh
Period: 14/10/02 to 16/10/02

Fingerprint

Glossaries
Information retrieval
Processing

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture

Cite this

Fei, H., & Vogel, S. (2002). Improved named entity translation and bilingual named entity extraction. In Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002 (pp. 253-258). [1167002] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICMI.2002.1167002

@inproceedings{46fa0244233b43179541d83685d2ec03,
title = "Improved named entity translation and bilingual named entity extraction",
abstract = "Translation of named entities (NE), including proper names and temporal and numerical expressions, is important in multilingual natural language processing tasks such as cross-lingual information retrieval and statistical machine translation. We present an integrated approach that extracts a named entity translation dictionary from a bilingual corpus while at the same time improving the named entity annotation quality. Starting from a bilingual corpus in which the named entities are extracted independently for each language, a statistical alignment model is used to align the named entities. An iterative process is then applied to extract the named entity pairs with higher alignment probability. This leads to a smaller but cleaner named entity translation dictionary and to a significant improvement in the monolingual named entity annotation quality for both languages. Experimental results show that the dictionary size is reduced by 51.8{\%} and that the annotation quality, in terms of F-score, improves from 70.03 to 78.15 for Chinese and from 73.38 to 81.46 for English.",
author = "Huang Fei and Stephan Vogel",
year = "2002",
doi = "10.1109/ICMI.2002.1167002",
language = "English",
isbn = "0769518346",
pages = "253--258",
booktitle = "Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - GEN

T1 - Improved named entity translation and bilingual named entity extraction

AU - Fei, Huang

AU - Vogel, Stephan

PY - 2002

Y1 - 2002

N2 - Translation of named entities (NE), including proper names and temporal and numerical expressions, is important in multilingual natural language processing tasks such as cross-lingual information retrieval and statistical machine translation. We present an integrated approach that extracts a named entity translation dictionary from a bilingual corpus while at the same time improving the named entity annotation quality. Starting from a bilingual corpus in which the named entities are extracted independently for each language, a statistical alignment model is used to align the named entities. An iterative process is then applied to extract the named entity pairs with higher alignment probability. This leads to a smaller but cleaner named entity translation dictionary and to a significant improvement in the monolingual named entity annotation quality for both languages. Experimental results show that the dictionary size is reduced by 51.8% and that the annotation quality, in terms of F-score, improves from 70.03 to 78.15 for Chinese and from 73.38 to 81.46 for English.

UR - http://www.scopus.com/inward/record.url?scp=84963811945&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84963811945&partnerID=8YFLogxK

U2 - 10.1109/ICMI.2002.1167002

DO - 10.1109/ICMI.2002.1167002

M3 - Conference contribution

SN - 0769518346

SN - 9780769518343

SP - 253

EP - 258

BT - Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002

PB - Institute of Electrical and Electronics Engineers Inc.

ER -