Active learning strategies for multi-label text classification

Andrea Esuli, Fabrizio Sebastiani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

Active learning refers to the task of devising a ranking function that, given a classifier trained from relatively few training examples, ranks a set of additional unlabeled examples in terms of how much further information they would carry, once manually labeled, for retraining a (hopefully) better classifier. Research on active learning in text classification has so far concentrated on single-label classification; active learning for multi-label classification, instead, has either been tackled in a simulated (and, we contend, non-realistic) way, or neglected tout court. In this paper we aim to fill this gap by examining a number of realistic strategies for tackling active learning for multi-label classification. Each such strategy consists of a rule for combining the outputs returned by the individual binary classifiers as a result of classifying a given unlabeled document. We present the results of extensive experiments in which we test these strategies on two standard text classification datasets.
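The idea of a combination rule over per-label classifier outputs can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `rank_for_labeling`, the three rules (`max`, `avg`, `min`), and the convention that each binary classifier returns a signed score whose magnitude is its confidence are all assumptions made here for illustration.

```python
from statistics import mean

def rank_for_labeling(scores, combine="max"):
    """Rank unlabeled documents, most informative first.

    scores:  dict mapping a document id to a list of signed outputs, one per
             binary (per-label) classifier. Assumed convention: the sign is
             the decision and the magnitude is the confidence.
    combine: hypothetical rule for merging the per-label uncertainties into
             a single document-level value ("max", "avg" or "min").
    """
    combiners = {"max": max, "avg": mean, "min": min}
    if combine not in combiners:
        raise ValueError(f"unknown combination rule: {combine!r}")
    merge = combiners[combine]

    # Per-label uncertainty is -|score|: outputs near the decision boundary
    # (score close to 0) are treated as the most uncertain.
    doc_uncertainty = {
        doc: merge([-abs(s) for s in outputs])
        for doc, outputs in scores.items()
    }
    # Most uncertain documents first: these are proposed for manual labeling.
    return sorted(doc_uncertainty, key=doc_uncertainty.get, reverse=True)

# Toy outputs of three binary classifiers on three unlabeled documents.
scores = {
    "d1": [0.9, -0.8, 0.7],
    "d2": [0.1, -0.9, 0.95],
    "d3": [0.4, -0.5, 0.3],
}
print(rank_for_labeling(scores, "max"))  # ['d2', 'd3', 'd1']
print(rank_for_labeling(scores, "avg"))  # ['d3', 'd2', 'd1']
print(rank_for_labeling(scores, "min"))  # ['d3', 'd1', 'd2']
```

On this toy input the three rules produce three different rankings, which is why the choice of combination rule matters in the multi-label setting: a document that is highly uncertain for a single label (d2) comes first under `max` but last under `min`.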

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 102-113
Number of pages: 12
Volume: 5478 LNCS
DOIs: https://doi.org/10.1007/978-3-642-00958-7_12
Publication status: Published - 2009
Externally published: Yes
Event: 31st European Conference on Information Retrieval, ECIR 2009 - Toulouse
Duration: 6 Apr 2009 to 9 Apr 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5478 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 31st European Conference on Information Retrieval, ECIR 2009
City: Toulouse
Period: 6/4/09 to 9/4/09

Fingerprint

Text Classification
Learning Strategies
Active Learning
Labels
Classifier
Classifiers
Ranking Function
Binary
Problem-Based Learning
Output
Experiment
Strategy
Experiments

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Esuli, A., & Sebastiani, F. (2009). Active learning strategies for multi-label text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5478 LNCS, pp. 102-113). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5478 LNCS). https://doi.org/10.1007/978-3-642-00958-7_12

Active learning strategies for multi-label text classification. / Esuli, Andrea; Sebastiani, Fabrizio.

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 5478 LNCS 2009. p. 102-113 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5478 LNCS).

Esuli, A & Sebastiani, F 2009, Active learning strategies for multi-label text classification. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 5478 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5478 LNCS, pp. 102-113, 31st European Conference on Information Retrieval, ECIR 2009, Toulouse, 6/4/09. https://doi.org/10.1007/978-3-642-00958-7_12
Esuli A, Sebastiani F. Active learning strategies for multi-label text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 5478 LNCS. 2009. p. 102-113. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-642-00958-7_12
Esuli, Andrea ; Sebastiani, Fabrizio. / Active learning strategies for multi-label text classification. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 5478 LNCS 2009. pp. 102-113 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{b84a5744dc4b4be48a2b463aab551d38,
title = "Active learning strategies for multi-label text classification",
abstract = "Active learning refers to the task of devising a ranking function that, given a classifier trained from relatively few training examples, ranks a set of additional unlabeled examples in terms of how much further information they would carry, once manually labeled, for retraining a (hopefully) better classifier. Research on active learning in text classification has so far concentrated on single-label classification; active learning for multi-label classification, instead, has either been tackled in a simulated (and, we contend, non-realistic) way, or neglected tout court. In this paper we aim to fill this gap by examining a number of realistic strategies for tackling active learning for multi-label classification. Each such strategy consists of a rule for combining the outputs returned by the individual binary classifiers as a result of classifying a given unlabeled document. We present the results of extensive experiments in which we test these strategies on two standard text classification datasets.",
author = "Andrea Esuli and Fabrizio Sebastiani",
year = "2009",
doi = "10.1007/978-3-642-00958-7_12",
language = "English",
isbn = "3642009573",
volume = "5478 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "102--113",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Active learning strategies for multi-label text classification

AU - Esuli, Andrea

AU - Sebastiani, Fabrizio

PY - 2009

Y1 - 2009

AB - Active learning refers to the task of devising a ranking function that, given a classifier trained from relatively few training examples, ranks a set of additional unlabeled examples in terms of how much further information they would carry, once manually labeled, for retraining a (hopefully) better classifier. Research on active learning in text classification has so far concentrated on single-label classification; active learning for multi-label classification, instead, has either been tackled in a simulated (and, we contend, non-realistic) way, or neglected tout court. In this paper we aim to fill this gap by examining a number of realistic strategies for tackling active learning for multi-label classification. Each such strategy consists of a rule for combining the outputs returned by the individual binary classifiers as a result of classifying a given unlabeled document. We present the results of extensive experiments in which we test these strategies on two standard text classification datasets.

UR - http://www.scopus.com/inward/record.url?scp=67650703463&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=67650703463&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-00958-7_12

DO - 10.1007/978-3-642-00958-7_12

M3 - Conference contribution

SN - 3642009573

SN - 9783642009570

VL - 5478 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 102

EP - 113

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -