Kernel-based reranking for named-entity extraction

Truc Vien T Nguyen, Alessandro Moschitti, Giuseppe Riccardi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Citations (Scopus)

Abstract

We present novel kernels based on structured and unstructured features for reranking the N-best hypotheses of conditional random fields (CRFs) applied to entity extraction. The former features are generated by a polynomial kernel encoding entity features whereas tree kernels are used to model dependencies amongst tagged candidate examples. The experiments on two standard corpora in two languages, i.e. the Italian EVALITA 2009 and the English CoNLL 2003 datasets, show a large improvement on CRFs in F-measure, i.e. from 80.34% to 84.33% and from 84.86% to 88.16%, respectively. Our analysis reveals that both kernels provide a comparable improvement over the CRFs baseline. Additionally, their combination improves CRFs much more than the sum of the individual contributions, suggesting an interesting kernel synergy.
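
The abstract describes reranking N-best hypotheses with a combination of a polynomial kernel over flat features and a structural kernel. The following is a minimal sketch of how such a combination might be wired into a pairwise preference kernel; it is not the authors' implementation. Everything in it is an assumption made for illustration: the `Hypothesis` layout, the kernel degree and offset, summing the two kernels, and in particular `bigram_kernel`, which is a simple stand-in for a structural kernel and is not a tree kernel.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical layout: (flat feature vector, predicted tag sequence).
Hypothesis = Tuple[Sequence[float], Sequence[str]]


def poly_kernel(x: Sequence[float], y: Sequence[float],
                degree: int = 2, coef0: float = 1.0) -> float:
    """Polynomial kernel (x . y + coef0) ** degree on flat feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree


def bigram_kernel(s: Sequence[str], t: Sequence[str]) -> float:
    """Stand-in structural kernel: counts shared tag bigrams.

    This is NOT a tree kernel; it only marks where one would plug in.
    """
    cs = Counter(zip(s, s[1:]))
    ct = Counter(zip(t, t[1:]))
    return float(sum(cs[g] * ct[g] for g in cs))


def combined_kernel(h1: Hypothesis, h2: Hypothesis) -> float:
    """Sum of the flat-feature kernel and the structural stand-in (assumed)."""
    return poly_kernel(h1[0], h2[0]) + bigram_kernel(h1[1], h2[1])


def preference_kernel(
    pair_a: Tuple[Hypothesis, Hypothesis],
    pair_b: Tuple[Hypothesis, Hypothesis],
    k: Callable[[Hypothesis, Hypothesis], float] = combined_kernel,
) -> float:
    """Pairwise preference kernel over ordered hypothesis pairs.

    K(<a1, a2>, <b1, b2>) = k(a1, b1) + k(a2, b2) - k(a1, b2) - k(a2, b1)
    """
    a1, a2 = pair_a
    b1, b2 = pair_b
    return k(a1, b1) + k(a2, b2) - k(a1, b2) - k(a2, b1)


if __name__ == "__main__":
    h1: Hypothesis = ([1.0, 0.0], ["B", "I", "O"])
    h2: Hypothesis = ([0.0, 1.0], ["O", "O", "O"])
    print(preference_kernel((h1, h2), (h1, h2)))
    print(preference_kernel((h1, h2), (h2, h1)))
```

Swapping the order of one pair flips the sign of the preference kernel, which is what lets a binary classifier trained on ordered pairs decide which of two hypotheses should be ranked higher.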

Original language: English
Title of host publication: Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference
Pages: 901-909
Number of pages: 9
Volume: 2
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 23rd International Conference on Computational Linguistics, Coling 2010 - Beijing, China
Duration: 23 Aug 2010 to 27 Aug 2010

Other

Other: 23rd International Conference on Computational Linguistics, Coling 2010
Country: China
City: Beijing
Period: 23/8/10 to 27/8/10

Fingerprint

synergy
candidacy
Polynomials
experiment
language
Experiments
Kernel
Entity

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Linguistics and Language

Cite this

Nguyen, T. V. T., Moschitti, A., & Riccardi, G. (2010). Kernel-based reranking for named-entity extraction. In Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 2, pp. 901-909)

Kernel-based reranking for named-entity extraction. / Nguyen, Truc Vien T; Moschitti, Alessandro; Riccardi, Giuseppe.

Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. Vol. 2 2010. p. 901-909.

Nguyen, TVT, Moschitti, A & Riccardi, G 2010, Kernel-based reranking for named-entity extraction. in Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. vol. 2, pp. 901-909, 23rd International Conference on Computational Linguistics, Coling 2010, Beijing, China, 23/8/10.
Nguyen TVT, Moschitti A, Riccardi G. Kernel-based reranking for named-entity extraction. In Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. Vol. 2. 2010. p. 901-909
Nguyen, Truc Vien T ; Moschitti, Alessandro ; Riccardi, Giuseppe. / Kernel-based reranking for named-entity extraction. Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference. Vol. 2 2010. pp. 901-909
@inproceedings{d9fe4048e66144ee834506ba3693b7e8,
title = "Kernel-based reranking for named-entity extraction",
abstract = "We present novel kernels based on structured and unstructured features for reranking the N-best hypotheses of conditional random fields (CRFs) applied to entity extraction. The former features are generated by a polynomial kernel encoding entity features whereas tree kernels are used to model dependencies amongst tagged candidate examples. The experiments on two standard corpora in two languages, i.e. the Italian EVALITA 2009 and the English CoNLL 2003 datasets, show a large improvement on CRFs in F-measure, i.e. from 80.34{\%} to 84.33{\%} and from 84.86{\%} to 88.16{\%}, respectively. Our analysis reveals that both kernels provide a comparable improvement over the CRFs baseline. Additionally, their combination improves CRFs much more than the sum of the individual contributions, suggesting an interesting kernel synergy.",
author = "Nguyen, {Truc Vien T} and Alessandro Moschitti and Giuseppe Riccardi",
year = "2010",
month = "12",
day = "1",
language = "English",
volume = "2",
pages = "901--909",
booktitle = "Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference",

}

TY - GEN

T1 - Kernel-based reranking for named-entity extraction

AU - Nguyen, Truc Vien T

AU - Moschitti, Alessandro

AU - Riccardi, Giuseppe

PY - 2010/12/1

Y1 - 2010/12/1

N2 - We present novel kernels based on structured and unstructured features for reranking the N-best hypotheses of conditional random fields (CRFs) applied to entity extraction. The former features are generated by a polynomial kernel encoding entity features whereas tree kernels are used to model dependencies amongst tagged candidate examples. The experiments on two standard corpora in two languages, i.e. the Italian EVALITA 2009 and the English CoNLL 2003 datasets, show a large improvement on CRFs in F-measure, i.e. from 80.34% to 84.33% and from 84.86% to 88.16%, respectively. Our analysis reveals that both kernels provide a comparable improvement over the CRFs baseline. Additionally, their combination improves CRFs much more than the sum of the individual contributions, suggesting an interesting kernel synergy.

UR - http://www.scopus.com/inward/record.url?scp=79959227538&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959227538&partnerID=8YFLogxK

M3 - Conference contribution

VL - 2

SP - 901

EP - 909

BT - Coling 2010 - 23rd International Conference on Computational Linguistics, Proceedings of the Conference

ER -