FURS

Fast and Unique Representative Subset selection retaining large-scale community structure

Raghvendra Mall, Rocco Langone, Johan A.K. Suykens

Research output: Contribution to journal (Article)

17 Citations (Scopus)

Abstract

We propose a novel algorithm, FURS (Fast and Unique Representative Subset selection), to deterministically select a set of nodes from a given graph that retains the underlying community structure. FURS greedily selects nodes with high degree centrality from most or all of the communities in the network. Within each community, nodes with high degree centrality are usually located at the center rather than the periphery and can better capture the community structure. The nodes are selected such that they are not isolated but can form disconnected components. FURS is evaluated using quality measures such as coverage, clustering coefficients, degree distributions and variation of information. Empirically, we observe that the nodes are selected such that most or all of the communities in the original network are retained. We compare our proposed technique with state-of-the-art methods such as the SlashBurn, Forest-Fire, Metropolis and Snowball Expansion sampling techniques. We evaluate FURS on several synthetic and real-world networks of varying size to demonstrate that the selected subset is of high quality and preserves the community structure. The subset generated by FURS can be used effectively by model-based approaches with out-of-sample extension properties to infer community affiliation in large-scale networks. A consequence of FURS is that the selected subset is also a good candidate set for a simple diffusion model. For several real-world networks, we compare the spread of information over time using FURS against representative subsets selected by random node selection, hubs selection, spokes selection, high eigenvector centrality, high PageRank, high betweenness centrality and low betweenness centrality.
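
The abstract describes the selection only at a high level (greedy, deterministic, degree-based, spread across communities). The following is a minimal sketch of one way such a rule could look, not the authors' implementation. It assumes that after a node is picked its neighbours are temporarily deactivated so that later picks land in other parts of the graph, and that all nodes are reactivated once no active candidate remains. The function name `greedy_degree_subset`, the adjacency-dict input and the tie-breaking by node label are all hypothetical choices made for illustration.

```python
def greedy_degree_subset(adj, k):
    """Pick up to k nodes, highest degree first, spreading picks over the graph.

    adj: dict mapping each node to the set of its neighbours (undirected graph).
    Assumption (not stated in the abstract): neighbours of a picked node are
    deactivated until every remaining node has been deactivated.
    """
    selected = []          # picks in the order they were made
    chosen = set()         # same nodes, for fast membership tests
    active = set(adj)      # nodes currently eligible for selection

    while len(selected) < k and len(chosen) < len(adj):
        candidates = active - chosen
        if not candidates:
            # Every unpicked node is deactivated: reactivate them all.
            active = set(adj)
            continue
        # Deterministic choice: highest degree, ties broken by smallest label.
        best = min(candidates, key=lambda n: (-len(adj[n]), n))
        selected.append(best)
        chosen.add(best)
        # Deactivate the picked node and its neighbours.
        active -= adj[best]
        active.discard(best)
    return selected


# Toy graph: two star-shaped communities (hubs 0 and 4) joined by edge 3-4.
adj = {
    0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4},
    4: {3, 5, 6, 7}, 5: {4}, 6: {4}, 7: {4},
}
print(greedy_degree_subset(adj, 3))  # prints [4, 0, 3]
```

On this toy graph the first two picks are the hubs of the two communities, which is the behaviour the abstract attributes to high-degree nodes; the third pick happens only after reactivation.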

Original language: English
Pages (from-to): 1075-1095
Number of pages: 21
Journal: Social Network Analysis and Mining
Volume: 3
Issue number: 4
DOI: 10.1007/s13278-013-0144-6
Publication status: Published - 1 Jan 2013
Externally published: Yes

Keywords

  • Community detection
  • Hubs
  • Node subset selection
  • Simple diffusion model

ASJC Scopus subject areas

  • Information Systems
  • Communication
  • Media Technology
  • Human-Computer Interaction
  • Computer Science Applications

Cite this

FURS: Fast and Unique Representative Subset selection retaining large-scale community structure. / Mall, Raghvendra; Langone, Rocco; Suykens, Johan A.K.

In: Social Network Analysis and Mining, Vol. 3, No. 4, 01.01.2013, p. 1075-1095.

@article{3400da752cca48fa9774aa04e2928fb0,
title = "FURS: Fast and Unique Representative Subset selection retaining large-scale community structure",
abstract = "We propose a novel algorithm, FURS (Fast and Unique Representative Subset selection), to deterministically select a set of nodes from a given graph that retains the underlying community structure. FURS greedily selects nodes with high degree centrality from most or all of the communities in the network. Within each community, nodes with high degree centrality are usually located at the center rather than the periphery and can better capture the community structure. The nodes are selected such that they are not isolated but can form disconnected components. FURS is evaluated using quality measures such as coverage, clustering coefficients, degree distributions and variation of information. Empirically, we observe that the nodes are selected such that most or all of the communities in the original network are retained. We compare our proposed technique with state-of-the-art methods such as the SlashBurn, Forest-Fire, Metropolis and Snowball Expansion sampling techniques. We evaluate FURS on several synthetic and real-world networks of varying size to demonstrate that the selected subset is of high quality and preserves the community structure. The subset generated by FURS can be used effectively by model-based approaches with out-of-sample extension properties to infer community affiliation in large-scale networks. A consequence of FURS is that the selected subset is also a good candidate set for a simple diffusion model. For several real-world networks, we compare the spread of information over time using FURS against representative subsets selected by random node selection, hubs selection, spokes selection, high eigenvector centrality, high PageRank, high betweenness centrality and low betweenness centrality.",
keywords = "Community detection, Hubs, Node subset selection, Simple diffusion model",
author = "Raghvendra Mall and Rocco Langone and Suykens, {Johan A.K.}",
year = "2013",
month = "1",
day = "1",
doi = "10.1007/s13278-013-0144-6",
language = "English",
volume = "3",
pages = "1075--1095",
journal = "Social Network Analysis and Mining",
issn = "1869-5450",
publisher = "Springer Wien",
number = "4",

}

TY - JOUR

T1 - FURS

T2 - Fast and Unique Representative Subset selection retaining large-scale community structure

AU - Mall, Raghvendra

AU - Langone, Rocco

AU - Suykens, Johan A.K.

PY - 2013/1/1

Y1 - 2013/1/1

N2 - We propose a novel algorithm, FURS (Fast and Unique Representative Subset selection), to deterministically select a set of nodes from a given graph that retains the underlying community structure. FURS greedily selects nodes with high degree centrality from most or all of the communities in the network. Within each community, nodes with high degree centrality are usually located at the center rather than the periphery and can better capture the community structure. The nodes are selected such that they are not isolated but can form disconnected components. FURS is evaluated using quality measures such as coverage, clustering coefficients, degree distributions and variation of information. Empirically, we observe that the nodes are selected such that most or all of the communities in the original network are retained. We compare our proposed technique with state-of-the-art methods such as the SlashBurn, Forest-Fire, Metropolis and Snowball Expansion sampling techniques. We evaluate FURS on several synthetic and real-world networks of varying size to demonstrate that the selected subset is of high quality and preserves the community structure. The subset generated by FURS can be used effectively by model-based approaches with out-of-sample extension properties to infer community affiliation in large-scale networks. A consequence of FURS is that the selected subset is also a good candidate set for a simple diffusion model. For several real-world networks, we compare the spread of information over time using FURS against representative subsets selected by random node selection, hubs selection, spokes selection, high eigenvector centrality, high PageRank, high betweenness centrality and low betweenness centrality.

KW - Community detection

KW - Hubs

KW - Node subset selection

KW - Simple diffusion model

UR - http://www.scopus.com/inward/record.url?scp=84921713163&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84921713163&partnerID=8YFLogxK

U2 - 10.1007/s13278-013-0144-6

DO - 10.1007/s13278-013-0144-6

M3 - Article

VL - 3

SP - 1075

EP - 1095

JO - Social Network Analysis and Mining

JF - Social Network Analysis and Mining

SN - 1869-5450

IS - 4

ER -