Improving term extraction by system combination using boosting

Jordi Vivaldi, Lluis Marques, Horacio Rodríguez

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

31 Citations (Scopus)

Abstract

Term extraction is the task of automatically detecting, from textual corpora, lexical units that designate concepts in thematically restricted domains (e.g. medicine). Current systems for term extraction integrate linguistic and statistical cues to perform the detection of terms. The best results have been obtained when some kind of combination of simple base term extractors is performed [14]. In this paper it is shown that this combination can be further improved by posing an additional learning problem of how to find the best combination of base term extractors. Empirical results, using AdaBoost in the metalearning step, show that the ensemble constructed surpasses the performance of all individual extractors and simple voting schemes, obtaining significantly better accuracy figures at all levels of recall.
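
The contrast the abstract draws between simple voting and a learned combination can be illustrated with a small sketch. This is not the authors' system: the three base extractors, their 0/1 votes, and the gold labels below are invented toy data, and the weak rules ("trust extractor j" or "trust its negation") are an assumption made to keep the example short. It only shows how discrete AdaBoost can weight base-extractor votes where plain majority voting cannot.

```python
import math

def majority_vote(votes):
    """Label a candidate +1 (term) if more than half of the extractors vote 1."""
    return 1 if 2 * sum(votes) > len(votes) else -1

def stump(x, j, polarity):
    """Weak rule: follow extractor j (polarity=+1) or its negation (polarity=-1)."""
    return polarity * (1 if x[j] else -1)

def train_adaboost(X, y, rounds):
    """Discrete AdaBoost over single-extractor rules.

    X: list of 0/1 vote tuples, one entry per (hypothetical) base extractor.
    y: gold labels, +1 for term and -1 for non-term.
    Returns a list of (alpha, extractor_index, polarity) triples.
    """
    n = len(X)
    w = [1.0 / n] * n                      # start from uniform example weights
    model = []
    for _ in range(rounds):
        best = None
        for j in range(len(X[0])):
            for polarity in (1, -1):
                # Weighted error of this rule under the current distribution.
                err = sum(wi for wi, xi, yi in zip(w, X, y)
                          if stump(xi, j, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, polarity)
        err, j, polarity = best
        if err >= 0.5:                     # no rule better than chance: stop
            break
        err = max(err, 1e-10)              # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, polarity))
        # Raise the weight of misclassified examples, lower the others.
        w = [wi * math.exp(-alpha * yi * stump(xi, j, polarity))
             for wi, xi, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of the selected rules."""
    score = sum(alpha * stump(x, j, p) for alpha, j, p in model)
    return 1 if score > 0 else -1

# Toy data (invented): votes of three hypothetical extractors per candidate.
# Extractors 0 and 1 each make one mistake; extractor 2 is systematically misleading.
X = [(0, 1, 0), (1, 1, 1), (1, 1, 0), (0, 1, 1), (0, 0, 1), (0, 0, 1)]
y = [1, 1, 1, -1, -1, -1]

model = train_adaboost(X, y, rounds=3)
vote_correct = sum(majority_vote(x) == t for x, t in zip(X, y))
boost_correct = sum(predict(model, x) == t for x, t in zip(X, y))
print(vote_correct, "of", len(y), "correct with majority voting")   # 4 of 6
print(boost_correct, "of", len(y), "correct with the boosted vote")  # 6 of 6
```

On this toy set the best single extractor gets 5 of 6 candidates right and majority voting 4 of 6, while the boosted combination gets all 6 by learning to invert the misleading extractor. These are training-set figures on invented data and say nothing about the results reported in the paper.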

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 515-526
Number of pages: 12
Volume: 2167
ISBN (Print): 3540425365, 9783540425366
Publication status: Published - 2001
Externally published: Yes
Event: 12th European Conference on Machine Learning, ECML 2001 - Freiburg, Germany
Duration: 5 Sep 2001 → 7 Sep 2001

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2167
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th European Conference on Machine Learning, ECML 2001
Country: Germany
City: Freiburg
Period: 5/9/01 → 7/9/01

Fingerprint

Boosting
Extractor
Adaptive boosting
Term
Linguistics
Medicine
Meta-learning
AdaBoost
Voting
Figure
Ensemble
Integrate
Unit

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Vivaldi, J., Marques, L., & Rodríguez, H. (2001). Improving term extraction by system combination using boosting. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2167, pp. 515-526). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2167). Springer Verlag.

@inproceedings{648e47ea6fdc4407acd63409c8e7bd85,
title = "Improving term extraction by system combination using boosting",
abstract = "Term extraction is the task of automatically detecting, from textual corpora, lexical units that designate concepts in thematically restricted domains (e.g. medicine). Current systems for term extraction integrate linguistic and statistical cues to perform the detection of terms. The best results have been obtained when some kind of combination of simple base term extractors is performed [14]. In this paper it is shown that this combination can be further improved by posing an additional learning problem of how to find the best combination of base term extractors. Empirical results, using AdaBoost in the metalearning step, show that the ensemble constructed surpasses the performance of all individual extractors and simple voting schemes, obtaining significantly better accuracy figures at all levels of recall.",
author = "Jordi Vivaldi and Lluis Marques and Horacio Rodr{\'i}guez",
year = "2001",
language = "English",
isbn = "3540425365",
volume = "2167",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "515--526",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
}

TY - GEN

T1 - Improving term extraction by system combination using boosting

AU - Vivaldi, Jordi

AU - Marques, Lluis

AU - Rodríguez, Horacio

PY - 2001

Y1 - 2001

AB - Term extraction is the task of automatically detecting, from textual corpora, lexical units that designate concepts in thematically restricted domains (e.g. medicine). Current systems for term extraction integrate linguistic and statistical cues to perform the detection of terms. The best results have been obtained when some kind of combination of simple base term extractors is performed [14]. In this paper it is shown that this combination can be further improved by posing an additional learning problem of how to find the best combination of base term extractors. Empirical results, using AdaBoost in the metalearning step, show that the ensemble constructed surpasses the performance of all individual extractors and simple voting schemes, obtaining significantly better accuracy figures at all levels of recall.

UR - http://www.scopus.com/inward/record.url?scp=84948132094&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84948132094&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84948132094

SN - 3540425365

SN - 9783540425366

VL - 2167

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 515

EP - 526

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

PB - Springer Verlag

ER -