Phrase pair classification for identifying subtopics

Sujatha Das, Prasenjit Mitra, C. Lee Giles

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we design features based on corpus statistics to build a classifier for identifying the (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier for query expansion in Expert Search and compare it with pseudo-relevance feedback.

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 489-493
Number of pages: 5
Volume: 7224 LNCS
DOIs: https://doi.org/10.1007/978-3-642-28997-2_48
Publication status: Published - 2012
Externally published: Yes
Event: 34th European Conference on Information Retrieval, ECIR 2012 - Barcelona
Duration: 1 Apr 2012 to 5 Apr 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7224 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 34th European Conference on Information Retrieval, ECIR 2012
City: Barcelona
Period: 1/4/12 to 5/4/12

Keywords

  • expert search
  • hypernym classification
  • query expansion

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Das, S., Mitra, P., & Giles, C. L. (2012). Phrase pair classification for identifying subtopics. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7224 LNCS, pp. 489-493). https://doi.org/10.1007/978-3-642-28997-2_48

@inproceedings{40bcb456dcd8404c92416ad1c0ac97a5,
title = "Phrase pair classification for identifying subtopics",
abstract = "Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we design features based on corpus statistics to build a classifier for identifying the (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier for query expansion in Expert Search and compare it with pseudo-relevance feedback.",
keywords = "expert search, hypernym classification, query expansion",
author = "Sujatha Das and Prasenjit Mitra and C. Lee Giles",
year = "2012",
doi = "10.1007/978-3-642-28997-2_48",
language = "English",
isbn = "9783642289965",
volume = "7224 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "489--493",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN
T1 - Phrase pair classification for identifying subtopics
AU - Das, Sujatha
AU - Mitra, Prasenjit
AU - Giles, C. Lee
PY - 2012
Y1 - 2012
AB - Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we design features based on corpus statistics to build a classifier for identifying the (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier for query expansion in Expert Search and compare it with pseudo-relevance feedback.
KW - expert search
KW - hypernym classification
KW - query expansion
UR - http://www.scopus.com/inward/record.url?scp=84860135465&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84860135465&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-28997-2_48
DO - 10.1007/978-3-642-28997-2_48
M3 - Conference contribution
AN - SCOPUS:84860135465
SN - 9783642289965
VL - 7224 LNCS
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 489
EP - 493
BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER -