Rare class detection in networks

Karthik Subbian, Charu C. Aggarwal, Jaideep Srivastava, Vipin Kumar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The problem of node classification in networks is an important one in a wide variety of social networking domains. In many real applications such as product recommendations, the class of interest may be very rare. In such scenarios, it is often very difficult to learn the most relevant node classification characteristics, both because of the paucity of training data, and because of poor connectivity among rare class nodes in the network structure. Node classification methods are crucially dependent upon structural homophily, and a lack of connectivity among rare class nodes can create significant challenges. However, many such social networks are content-rich, and the content-rich nature of such networks can be leveraged to compensate for the lack of structural connectivity among rare class nodes. While content-centric and semi-supervised methods have been used earlier in the context of paucity of labeled data, the rare class scenario has not been investigated in this context. In fact, we are not aware of any known classification method which is tailored towards rare class detection in networks. This paper will present a spectral approach for rare-class detection, which uses a distance-preserving transform, in order to combine the structural information in the network with the available content. We will show the advantage of this approach over traditional methods for collective classification.

Original language: English
Title of host publication: SIAM International Conference on Data Mining 2015, SDM 2015
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 406-414
Number of pages: 9
ISBN (Print): 9781510811522
Publication status: Published - 2015
Event: SIAM International Conference on Data Mining 2015, SDM 2015 - Vancouver, Canada
Duration: 30 Apr 2015 to 2 May 2015

Other

Other: SIAM International Conference on Data Mining 2015, SDM 2015
Country: Canada
City: Vancouver
Period: 30/4/15 to 2/5/15

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Subbian, K., Aggarwal, C. C., Srivastava, J., & Kumar, V. (2015). Rare class detection in networks. In SIAM International Conference on Data Mining 2015, SDM 2015 (pp. 406-414). Society for Industrial and Applied Mathematics Publications.

Rare class detection in networks. / Subbian, Karthik; Aggarwal, Charu C.; Srivastava, Jaideep; Kumar, Vipin.

SIAM International Conference on Data Mining 2015, SDM 2015. Society for Industrial and Applied Mathematics Publications, 2015. p. 406-414.

Subbian, K, Aggarwal, CC, Srivastava, J & Kumar, V 2015, Rare class detection in networks. in SIAM International Conference on Data Mining 2015, SDM 2015. Society for Industrial and Applied Mathematics Publications, pp. 406-414, SIAM International Conference on Data Mining 2015, SDM 2015, Vancouver, Canada, 30/4/15.
Subbian K, Aggarwal CC, Srivastava J, Kumar V. Rare class detection in networks. In SIAM International Conference on Data Mining 2015, SDM 2015. Society for Industrial and Applied Mathematics Publications. 2015. p. 406-414
Subbian, Karthik ; Aggarwal, Charu C. ; Srivastava, Jaideep ; Kumar, Vipin. / Rare class detection in networks. SIAM International Conference on Data Mining 2015, SDM 2015. Society for Industrial and Applied Mathematics Publications, 2015. pp. 406-414
@inproceedings{50758f88af7545f8a04ecc9ef6fa561a,
title = "Rare class detection in networks",
abstract = "The problem of node classification in networks is an important one in a wide variety of social networking domains. In many real applications such as product recommendations, the class of interest may be very rare. In such scenarios, it is often very difficult to learn the most relevant node classification characteristics, both because of the paucity of training data, and because of poor connectivity among rare class nodes in the network structure. Node classification methods are crucially dependent upon structural homophily, and a lack of connectivity among rare class nodes can create significant challenges. However, many such social networks are content-rich, and the content-rich nature of such networks can be leveraged to compensate for the lack of structural connectivity among rare class nodes. While content-centric and semi-supervised methods have been used earlier in the context of paucity of labeled data, the rare class scenario has not been investigated in this context. In fact, we are not aware of any known classification method which is tailored towards rare class detection in networks. This paper will present a spectral approach for rare-class detection, which uses a distance-preserving transform, in order to combine the structural information in the network with the available content. We will show the advantage of this approach over traditional methods for collective classification.",
author = "Karthik Subbian and Aggarwal, {Charu C.} and Jaideep Srivastava and Vipin Kumar",
year = "2015",
language = "English",
isbn = "9781510811522",
pages = "406--414",
booktitle = "SIAM International Conference on Data Mining 2015, SDM 2015",
publisher = "Society for Industrial and Applied Mathematics Publications",

}

TY - GEN

T1 - Rare class detection in networks

AU - Subbian, Karthik

AU - Aggarwal, Charu C.

AU - Srivastava, Jaideep

AU - Kumar, Vipin

PY - 2015

Y1 - 2015

AB - The problem of node classification in networks is an important one in a wide variety of social networking domains. In many real applications such as product recommendations, the class of interest may be very rare. In such scenarios, it is often very difficult to learn the most relevant node classification characteristics, both because of the paucity of training data, and because of poor connectivity among rare class nodes in the network structure. Node classification methods are crucially dependent upon structural homophily, and a lack of connectivity among rare class nodes can create significant challenges. However, many such social networks are content-rich, and the content-rich nature of such networks can be leveraged to compensate for the lack of structural connectivity among rare class nodes. While content-centric and semi-supervised methods have been used earlier in the context of paucity of labeled data, the rare class scenario has not been investigated in this context. In fact, we are not aware of any known classification method which is tailored towards rare class detection in networks. This paper will present a spectral approach for rare-class detection, which uses a distance-preserving transform, in order to combine the structural information in the network with the available content. We will show the advantage of this approach over traditional methods for collective classification.

UR - http://www.scopus.com/inward/record.url?scp=84961944336&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84961944336&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781510811522

SP - 406

EP - 414

BT - SIAM International Conference on Data Mining 2015, SDM 2015

PB - Society for Industrial and Applied Mathematics Publications

ER -